Applying function on iterative combinations of columns in data frame in R

Question

I want to build a particular function, which sums rows of different combinations of columns in whole data frame.

Example - I have a data frame with 1st descriptive column and 20 columns (v1, v2, v3, ...) from 1 to 20, where only possible values are 1 or 0. I want a function, where I can sum values from columns based on specific occurrence. If per example function would be set on 3, then I need that first 3 columns (v1, v2 and v3) would be summed together in first newly created variable (v1s), then next 3 columns would be summed together (v2, v3, v4) in next newly created variable (v2s), etc.

Reproducible example:

set.seed(300) 
db <- matrix(sample(0:1,5*20, replace=TRUE),5,20)
us <- c("A","B","C","D","E")
db <- as.data.frame(cbind(us,db))

So I want to be in control to defining how many columns do I want summed up (by in control I mean it shouldn't be hard to change this to 5 columns per example). In case of 3 columns, I want to create 18 additional vars (all continuing combos of 3 variables from 1st to last) containing row sums of 3 columns.

I tried few options, however I can't really figure out easy way to do it. I was thinking of a function, which takes as input number of columns and then rowsums up values and iterates over all combos, however I can't manage to crack this one. Please help!

Martin Schmelzer · Accepted Answer

There might be an easier way. Anyhow, here is my approach:

Code:

grp_colsum <- function(df, n) {
  # generate the set of indices 123, 234, 345, ...
  idx <- sapply(1:n, `+`, 0:(ncol(df)-n))
  # for each set of indices
  res <- apply(idx, 1, function(x) {
    rowSums(df[,x])
  })
  colnames(res) <- paste0("v", 1:ncol(res), "s")
  res
}
grp_colsum(db[,-1], 3)

Output:

    v1s v2s v3s v4s v5s v6s v7s v8s v9s v10s v11s v12s v13s v14s v15s v16s v17s v18s
[1,]   2   2   3   2   2   1   1   1   1    2    1    2    2    2    2    1    2    1
[2,]   2   2   1   2   2   2   1   1   2    2    2    1    2    1    1    1    1    2
[3,]   2   2   2   2   2   3   3   2   2    2    3    3    2    2    1    2    2    2
[4,]   2   2   2   2   1   2   2   2   2    2    2    2    1    1    1    2    3    3
[5,]   3   3   2   1   0   1   1   2   2    2    1    1    2    3    2    1    1    1

Applying function on iterative combinations of columns in data frame in R

Answers (2)

Related Questions