Reputation: 311
I have a question about using expand.grid() with two data frames rather than two vectors. I want to generate every ordered combination of the label in the first column of each data frame and, for each combination, subtract the remaining variables. For example:
df1 <- data.frame('USC', '2.3', '1.3', '5.4')
df2 <- data.frame('Texas', '1.2', '-1.4', '2.3')
I can get all the combinations of the first variable with expand.grid(), e.g. 'USC Texas', 'Texas USC', etc., but I also want the difference between the remaining variables in each data frame. The desired result is:
('USC Texas', '1.1', '2.7', '3.1')
('Texas USC', '-1.1', '-2.7', '-3.1')
Can I somehow combine expand.grid() with apply()? Any help would be appreciated.
Upvotes: 1
Views: 1722
Reputation: 44614
This is another way:
# clean up the data. Put df1 and df2 into one data.frame and convert the columns
# to their natural data type. Name the columns.
names(df2) <- names(df1)
d <- rbind(df1, df2)
names(d) <- letters[1:4]
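# as.is = TRUE keeps the label column as character and avoids the warning
# that recent R versions give when as.is is left unspecified
d[] <- lapply(d, function(col) type.convert(as.character(col), as.is = TRUE))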
#       a   b    c   d
# 1   USC 2.3  1.3 5.4
# 2 Texas 1.2 -1.4 2.3
# get the cartesian product of d with itself
x <- merge(d, d, by=character(0))
x <- subset(x, a.x != a.y)
x <- within(x, {
  a <- paste(a.x, a.y)
  b <- b.x - b.y
  c <- c.x - c.y
  d <- d.x - d.y
})
x[c('a', 'b', 'c', 'd')]
#           a    b    c    d
# 2 Texas USC -1.1 -2.7 -3.1
# 3 USC Texas  1.1  2.7  3.1
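Since the question asks about expand.grid() specifically, here is a minimal sketch of the same idea that runs expand.grid() over row indices and subtracts all numeric columns at once, so b, c and d need not be named by hand. It assumes the data has already been cleaned into one row per team with a label column a and numeric columns; the names idx, num_cols, diffs and pairwise are only illustrative.

```r
# Assumed input: one row per team, a label column plus numeric columns.
d <- data.frame(a = c("USC", "Texas"),
                b = c(2.3, 1.2),
                c = c(1.3, -1.4),
                d = c(5.4, 2.3),
                stringsAsFactors = FALSE)

# All ordered pairs of row indices, dropping pairs of a row with itself.
idx <- expand.grid(i = seq_len(nrow(d)), j = seq_len(nrow(d)))
idx <- idx[idx$i != idx$j, ]

# Subtract every numeric column in one step.
num_cols <- names(d)[sapply(d, is.numeric)]
diffs <- d[idx$i, num_cols] - d[idx$j, num_cols]

# Combine the pasted labels with the differences.
pairwise <- data.frame(a = paste(d$a[idx$i], d$a[idx$j]), diffs,
                       row.names = NULL, stringsAsFactors = FALSE)
pairwise
```

This should also extend to more than two teams, because expand.grid() enumerates every ordered pair of rows.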
Upvotes: 1
Reputation: 81683
Here's an approach:
mapply(function(x, y) if (!grepl("^[+-]?\\d+\\.\\d+$", x))
         c(paste(x, y), paste(y, x)) else
         c(res <- as.numeric(as.character(x)) - as.numeric(as.character(y)),
           -res), df1, df2)
#      X.USC.      X.2.3. X.1.3. X.5.4.
# [1,] "USC Texas" "1.1"  "2.7"  "3.1"
# [2,] "Texas USC" "-1.1" "-2.7" "-3.1"
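Two caveats with the above: the regular expression only matches values with a decimal point (so "2" would be treated as a label), and the result is a character matrix. A possible variant, sketched under the assumption that the first column is always the label, tests each value by attempting the numeric conversion and keeps the differences numeric. The helper is_number and the names res, out and pair are hypothetical, not part of any package.

```r
df1 <- data.frame('USC', '2.3', '1.3', '5.4', stringsAsFactors = FALSE)
df2 <- data.frame('Texas', '1.2', '-1.4', '2.3', stringsAsFactors = FALSE)

# Hypothetical helper: TRUE when a single value parses as a number.
is_number <- function(v) !is.na(suppressWarnings(as.numeric(as.character(v))))

# Column by column: difference in both directions for numbers,
# pasted labels in both orders otherwise.
res <- Map(function(x, y) {
  if (is_number(x) && is_number(y)) {
    delta <- as.numeric(as.character(x)) - as.numeric(as.character(y))
    c(delta, -delta)
  } else {
    c(paste(x, y), paste(y, x))
  }
}, df1, df2)

# Assumes column 1 is the label; the rest are bound as numeric columns.
out <- data.frame(pair = res[[1]], do.call(cbind, unname(res[-1])),
                  stringsAsFactors = FALSE)
out
```

Map() returns a list, so the label and the numeric differences are not coerced to a common type the way they are in the mapply() matrix.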
Upvotes: 3