GNicoletti

Reputation: 204

Calculate all possible product combinations between variables

I have a df containing 3 variables, and I want to create an extra variable for each possible product combination.

test <- data.frame(a = rnorm(10,0,1)
                   , b = rnorm(10,0,1)
                   , c = rnorm(10,0,1))

I want to create a new df (output) containing the result of a*b, a*c, b*c.

output <- data.frame(d = test$a * test$b
                        , e = test$a * test$c
                        , f = test$b * test$c)

This is easy to do manually with a small number of columns, but with more than 5 columns it quickly becomes lengthy and error-prone, especially when column names contain prefixes, suffixes or embedded codes.

It would be a bonus if I could also control the maximum number of columns to combine at once (in the example above I only considered 2 columns, but it would be great to set that parameter too, so as to add an extra variable a*b*c if needed).

My initial idea was to use expand.grid() with the column names and then somehow look up the corresponding columns to compute the product, but I hope there's an easier way that I am not aware of.

Upvotes: 0

Views: 300

Answers (2)

TarJae

Reputation: 78917

Could this one also be a solution? Ronak's solution is more elegant, though!

library(dplyr)

# your data
test <- data.frame(a = rnorm(10,0,1)
                   , b = rnorm(10,0,1)
                   , c = rnorm(10,0,1))

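# new dataframe output
# use `*` for element-wise products; prod() would collapse each pair of
# columns into a single number repeated on every row
output <- test %>%
  mutate(a_b = a * b,
         a_c = a * c,
         b_c = b * c) %>%
  select(-a, -b, -c)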
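
One thing worth checking in a pipeline like this is the difference between prod() and `*`: prod() multiplies every value in all of its arguments down to a single number, whereas `*` multiplies element by element and keeps the vector length. A minimal sketch with two made-up vectors (x and y are illustrative only, not part of the question's data):

```r
x <- c(1, 2, 3)
y <- c(4, 5, 6)

# prod() collapses everything to one scalar: 1*2*3*4*5*6
prod(x, y)
# [1] 720

# `*` multiplies position by position and returns a vector
x * y
# [1]  4 10 18
```

Inside mutate(), a scalar result is recycled to every row, so row-wise products need `*`.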

Upvotes: 1

Ronak Shah

Reputation: 388807

You can use combn to create combinations of column names taken 2 at a time and multiply them to create new columns.

cbind(test, do.call(cbind, combn(names(test), 2, function(x) {
  setNames(data.frame(do.call(`*`, test[x])), paste0(x, collapse = '-'))
}, simplify = FALSE)))

#            a          b          c        a-b       a-c        b-c
#1   0.4098568 -0.3514020  2.5508854 -0.1440245  1.045498 -0.8963863
#2   1.4066395  0.6693990  0.1858557  0.9416031  0.261432  0.1244116
#3   0.7150305 -1.1247699  2.8347166 -0.8042448  2.026909 -3.1884040
#4   0.8932950  1.6330398  0.3731903  1.4587864  0.333369  0.6094346
#5  -1.4895243  1.4124826  1.0092224 -2.1039271 -1.503261  1.4255091
#6   0.8239685  0.1347528  1.4274288  0.1110321  1.176156  0.1923501
#7   0.7803712  0.8685688 -0.5676055  0.6778060 -0.442943 -0.4930044
#8  -1.5760181  2.0014636  1.1844449 -3.1543428 -1.866707  2.3706233
#9   1.4414434  1.1134435 -1.4500410  1.6049658 -2.090152 -1.6145388
#10  0.3526583 -0.1238261  0.8949428 -0.0436683  0.315609 -0.1108172
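
The question also asks about controlling how many columns are combined at once. `*` only takes two operands, so one possible way to extend the combn idea is to swap do.call for Reduce(), which multiplies the selected columns pairwise. This is only a sketch: product_combinations, its arguments m and sep, and the small integer-valued test frame are hypothetical names and data chosen for illustration, and "_" is used as the separator instead of "-".

```r
# Hypothetical helper: products of every combination of m columns of df
product_combinations <- function(df, m = 2, sep = "_") {
  # list of character vectors, one per combination of m column names
  combos <- combn(names(df), m, simplify = FALSE)
  # Reduce() applies `*` pairwise, so any number of columns can be multiplied
  cols <- lapply(combos, function(nms) Reduce(`*`, df[nms]))
  names(cols) <- vapply(combos, paste, character(1), collapse = sep)
  data.frame(cols, check.names = FALSE)
}

# small deterministic example so the products are easy to verify
test <- data.frame(a = c(1, 2), b = c(3, 4), c = c(5, 6))

pairs <- product_combinations(test, 2)
pairs
#   a_b a_c b_c
# 1   3   5  15
# 2   8  12  24

# every combination of 2 up to 3 columns, bound side by side
up_to_3 <- do.call(cbind, lapply(2:3, function(m) product_combinations(test, m)))
names(up_to_3)
# [1] "a_b"   "a_c"   "b_c"   "a_b_c"
```

Looping over 2:m and binding the results gives the "maximum number of columns" behaviour; calling the helper with a single m gives only the combinations of exactly that size.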

Upvotes: 4
