kin182

Reputation: 403

How to apply a function to every two columns of a data frame without repetitions?

I have a data frame with 2000 rows and 40000 columns, and I would like to apply a function to each consecutive pair of columns, using every column exactly once. In the following example, I would like to add the values of each pair: V1 and V2, V3 and V4.

set.seed(42)
df <- as.data.frame(matrix(rnorm(16),4))

df
          V1          V2         V3         V4
1  1.3709584  0.40426832  2.0184237 -1.3888607
2 -0.5646982 -0.10612452 -0.0627141 -0.2787888
3  0.3631284  1.51152200  1.3048697 -0.1333213
4  0.6328626 -0.09465904  2.2866454  0.6359504

The desired output would look like this:

data.frame("V1" = df$V1+df$V2, "V2"=df$V3+df$V4)

          V1         V2
1  1.7752268  0.6295630
2 -0.6708227 -0.3415029
3  1.8746504  1.1715483
4  0.5382036  2.9225958

I thought of using combn, but it generates every possible pair of columns rather than only the non-overlapping adjacent pairs. Could anyone help? Thanks!

Upvotes: 1

Views: 62

Answers (2)

cmaher

Reputation: 5215

Perhaps the simplest way to do this is to index with two sequences -- one that gives c(1, 3, ...) and another that gives c(2, 4, ...) -- and add the results:

df[, seq(1, ncol(df), 2)] + df[, seq(2, ncol(df), 2)]

#           V1         V3
# 1  1.7752268  0.6295630
# 2 -0.6708227 -0.3415029
# 3  1.8746504  1.1715483
# 4  0.5382036  2.9225958
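The same indexing idea can be wrapped up for an arbitrary two-argument function, not only addition. The following is a minimal sketch; `pairwise_apply` is a hypothetical helper name (not part of base R), and it assumes the data frame has an even, non-zero number of columns. The final `setNames` call is only there to reproduce the V1, V2 column names from the question.

```r
set.seed(42)
df <- as.data.frame(matrix(rnorm(16), 4))

# Hypothetical helper: apply f to each non-overlapping pair of adjacent
# columns (1 & 2, 3 & 4, ...). Assumes an even, non-zero number of columns.
pairwise_apply <- function(data, f) {
  stopifnot(ncol(data) > 0, ncol(data) %% 2 == 0)
  odd  <- seq(1, ncol(data), by = 2)
  even <- seq(2, ncol(data), by = 2)
  # Map() walks both column lists in parallel; result names come from the
  # first list, i.e. the odd-numbered columns.
  as.data.frame(Map(f, data[odd], data[even]))
}

res <- pairwise_apply(df, `+`)

# Optional: rename to V1, V2, ... as in the question's desired output
res_renamed <- setNames(res, paste0("V", seq_along(res)))
```

Any function that takes two vectors and returns a vector of the same length should work in place of `+`, for example `function(a, b) a * b` or `pmax`.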

Upvotes: 1

Melissa Key

Reputation: 4551

Try using map2_df from the purrr package:

library(purrr)
map2_df(.x = df[seq(1,ncol(df),2)], .y = df[seq(2, ncol(df), 2)], ~ .x + .y)

#  A tibble: 4 x 2
#       V1     V3
#    <dbl>  <dbl>
# 1  1.78   0.630
# 2 -0.671 -0.342
# 3  1.87   1.17 
# 4  0.538  2.92 
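For a table as wide as the one in the question (2000 x 40000), it may be worth trying the same odd/even split on a matrix instead of a data frame, since arithmetic on a numeric matrix is a single vectorised operation. This is a base R sketch, not a benchmarked recommendation, and it assumes every column is numeric so that `as.matrix` does not coerce anything to character.

```r
set.seed(42)
df <- as.data.frame(matrix(rnorm(16), 4))

m <- as.matrix(df)  # assumes all columns are numeric

odd  <- seq(1, ncol(m), by = 2)
even <- seq(2, ncol(m), by = 2)

# drop = FALSE keeps the result a matrix even if there is only one pair
sums <- m[, odd, drop = FALSE] + m[, even, drop = FALSE]

res_m <- as.data.frame(sums)
```

The result keeps the names of the odd-numbered columns (V1, V3), the same as the data frame versions above.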

Upvotes: 1
