Reputation: 2293
I have two 2x2 data frames. Each column in each data frame is a factor.
I want to create a 2x8 data frame that contains each factor and the interactions between factors.
Here is an example:
df1 <- data.frame(V1 = factor(c('a', 'b')), V2 = factor(c('c', 'd')))
df2 <- data.frame(V3 = factor(c('e', 'f')), V4 = factor(c('g', 'h')))
df.combined <- combine(df1, df2)
Where df.combined would be
V1 V2 V3 V4 V1:V3 V1:V4 V2:V3 V2:V4
a c e g a:e a:g c:e c:g
b d f h b:f b:h d:f d:h
(I don't want the V1:V2 or V3:V4 interactions. Not needing those interactions is just in the nature of the problem that I face.)
Is there a succinct way to get df.combined in R?
Upvotes: 2
Views: 2278
Reputation: 688
If the colons in the column names are not required, a single line of code takes care of both column-binding the two data frames and creating the interactions. Using your two data frames:
df.combined <- with(c(df1, df2), data.frame(df1, df2, V1:V3, V1:V4, V2:V3, V2:V4))
which gives
V1 V2 V3 V4 V1.V3 V1.V4 V2.V3 V2.V4
1 a c e g a:e a:g c:e c:g
2 b d f h b:f b:h d:f d:h
If you need the colons in the names, another one-liner will change the periods to colons:
colnames(df.combined) <- gsub("\\.", ":", colnames(df.combined))
leaving the final results
V1 V2 V3 V4 V1:V3 V1:V4 V2:V3 V2:V4
1 a c e g a:e a:g c:e c:g
2 b d f h b:f b:h d:f d:h
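As a possible alternative to the gsub() step, here is a minimal sketch that assumes the periods come only from data.frame()'s default check.names = TRUE. Passing check.names = FALSE should keep the colons from the start, and it avoids touching any periods that were already in the original column names:

```r
df1 <- data.frame(V1 = factor(c("a", "b")), V2 = factor(c("c", "d")))
df2 <- data.frame(V3 = factor(c("e", "f")), V4 = factor(c("g", "h")))

# check.names = FALSE stops data.frame() from turning "V1:V3" into "V1.V3"
df.combined <- with(c(df1, df2),
                    data.frame(df1, df2, V1:V3, V1:V4, V2:V3, V2:V4,
                               check.names = FALSE))

# Inspect the resulting column names
names(df.combined)
```

The trade-off is the same as before: the interaction columns now have non-syntactic names.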
Upvotes: 0
Reputation: 263332
I'm not sure if this meets your definition of "succinctly".
dfc <- cbind(df1, df2)
dfc2 <- cbind(dfc, `V1:V3` = interaction(dfc$V1, dfc$V3, sep = ":"),
                   `V1:V4` = interaction(dfc$V1, dfc$V4, sep = ":"))
df.combined <- cbind(dfc2, `V2:V3` = interaction(dfc$V2, dfc$V3, sep = ":"),
                           `V2:V4` = interaction(dfc$V2, dfc$V4, sep = ":"))
> df.combined
V1 V2 V3 V4 V1:V3 V1:V4 V2:V3 V2:V4
1 a c e g a:e a:g c:e c:g
2 b d f h b:f b:h d:f d:h
(It is generally not recommended to have colons in variable names. They will then always need to be quoted with backticks.)
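To illustrate the quoting point, here is a small sketch (the two-column data frame df is made up for the example, it is not from the question):

```r
# Illustrative data frame, not from the original question
df <- data.frame(V1 = factor(c("a", "b")), V3 = factor(c("e", "f")))
df[["V1:V3"]] <- interaction(df$V1, df$V3, sep = ":")

# With `$`, a name containing a colon has to be wrapped in backticks
via_dollar <- as.character(df$`V1:V3`)

# With `[[`, the name is passed as an ordinary string instead
via_brackets <- as.character(df[["V1:V3"]])

identical(via_dollar, via_brackets)
```

Without the backticks, df$V1:V3 would be parsed as (df$V1):V3, which is not the column lookup you want.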
Upvotes: 2
Reputation: 14667
Here is one solution. Maybe not terribly elegant or succinct, but possibly useful...
dat <- data.frame(V1=c("a", "b"),
V2=c("c", "d"),
V3=c("e", "f"),
V4=c("g", "h"))
factor_pairs <- expand.grid(c("V1", "V2"),
c("V3", "V4"),
stringsAsFactors=FALSE)
for (i in seq_len(nrow(factor_pairs))) {
  factor_1 <- factor_pairs[i, 1]
  factor_2 <- factor_pairs[i, 2]
  new_col_name <- paste(factor_1, factor_2, sep=":")
  dat[[new_col_name]] <- paste(dat[[factor_1]], dat[[factor_2]], sep=":")
}
dat
# V1 V2 V3 V4 V1:V3 V2:V3 V1:V4 V2:V4
# 1 a c e g a:e c:e a:g c:g
# 2 b d f h b:f d:f b:h d:h
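If the new columns should be factors and the question's two separate data frames should be the input, the same idea could be wrapped in a function. This is only a sketch: the name combine_with_interactions and its arguments are hypothetical, and it assumes both data frames have the same number of rows.

```r
# Hypothetical helper; the name and signature are assumptions for illustration
combine_with_interactions <- function(x, y, sep = ":") {
  # Listing y's names first makes them vary fastest,
  # which gives the column order V1:V3, V1:V4, V2:V3, V2:V4
  pairs <- expand.grid(b = names(y), a = names(x), stringsAsFactors = FALSE)
  out <- cbind(x, y)
  for (i in seq_len(nrow(pairs))) {
    a <- pairs$a[i]
    b <- pairs$b[i]
    # interaction() returns a factor; drop = TRUE removes unused level combinations
    out[[paste(a, b, sep = sep)]] <- interaction(x[[a]], y[[b]], sep = sep, drop = TRUE)
  }
  out
}

df1 <- data.frame(V1 = factor(c("a", "b")), V2 = factor(c("c", "d")))
df2 <- data.frame(V3 = factor(c("e", "f")), V4 = factor(c("g", "h")))
df.combined <- combine_with_interactions(df1, df2)
df.combined
```

Only cross-frame pairs are generated, so the V1:V2 and V3:V4 interactions are never created.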
Upvotes: 0