Reputation: 23
I have a large data set with 32 variables, and I need to work with relative values of these variables, using all possible pairwise subtractions among them, e.g. var1-var2 ... var1-var32; var2-var3 ... var2-var32, and so on. I'm new to R and would like to avoid computing every difference by hand, but I can't think of another way to do it. Any help is appreciated. Thanks!
Ex: df_original
id | Var1 | Var2 | Var3 |
---|---|---|---|
x | 1 | 3 | 2 |
y | 2 | 5 | 7 |
df_wanted
id | Var1 | Var2 | Var3 | Var1-Var2 | Var1-Var3 | Var2-Var3 |
---|---|---|---|---|---|---|
x | 1 | 3 | 2 | -2 | -1 | 1 |
y | 2 | 5 | 7 | -3 | -5 | -2 |
Upvotes: 1
Views: 84
Reputation: 388982
You can do this with `combn`, which creates all combinations of the columns taken 2 at a time. `combn` also accepts a function that is applied to every combination; here that function subtracts the second column of the pair from the first, and the results are then bound to the original data frame as new columns.
```r
cols <- grep('Var', names(df), value = TRUE)
new_df <- cbind(df, do.call(cbind, combn(cols, 2, function(x) {
  setNames(data.frame(df[x[1]] - df[x[2]]), paste0(x, collapse = '-'))
}, simplify = FALSE)))
new_df
#  id Var1 Var2 Var3 Var1-Var2 Var1-Var3 Var2-Var3
#1  x    1    3    2        -2        -1         1
#2  y    2    5    7        -3        -5        -2
```
data
```r
df <- structure(list(id = c("x", "y"), Var1 = 1:2, Var2 = c(3L, 5L),
    Var3 = c(2L, 7L)), class = "data.frame", row.names = c(NA, -2L))
```
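As a possible alternative, the same pairwise differences can be computed in one vectorized step instead of calling a function per pair. This is only a sketch under the assumption that all `Var` columns are numeric and that their names start with `Var`; the helper names `pairs`, `m` and `diffs` are made up for illustration and are not part of the answer above.

```r
# Example data from the question
df <- data.frame(id = c("x", "y"), Var1 = 1:2,
                 Var2 = c(3L, 5L), Var3 = c(2L, 7L))

# Columns to compare (assumes their names start with "Var")
cols <- grep("^Var", names(df), value = TRUE)

# 2-row character matrix: one column per pair of variable names
pairs <- combn(cols, 2)

# Work on a numeric matrix so whole blocks of columns can be subtracted at once
m <- as.matrix(df[cols])
diffs <- m[, pairs[1, ], drop = FALSE] - m[, pairs[2, ], drop = FALSE]
colnames(diffs) <- paste(pairs[1, ], pairs[2, ], sep = "-")

# Bind the difference columns to the original data frame
new_df <- cbind(df, diffs)
new_df
```

With 32 variables this would add `choose(32, 2)`, i.e. 496, new columns. Because names such as `Var1-Var2` are not syntactic R names, they need to be accessed with `new_df[["Var1-Var2"]]` or with backticks after `$`.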
Upvotes: 1