I have a large data frame of 1129 rows and 4662 columns. I want to sum the row values over consecutive blocks of 3 columns, and then return 1 for each block if its row sum is > 0, or 0 if the sum is < 1. I have added a small reproducible example below. I would like to sum the row values of columns 1 to 3, then the row values of columns 4 to 6, and so on through my real data.
df <- read.table(text =" 2005-09-23_2005-09-26 2005-09-27_2005-10-30 2005-10-07_2005-10-08 2005-10-09_2005-10-10 2005-10-11_2005-10-12 2005-10-13_2005-10-14
1 1 0 1 1 1 1
2 1 1 0 0 0 0
3 NA NA NA NA NA 0", header = TRUE)
The result I am after would be this:
result <- read.table(text =" 2005-09-23_2005-10-08 2005-10-09_2005-10-14
1 1 1
2 1 0
3 NA 0", header = TRUE)
I looked for similar questions, and it seems that rollapply or rowsum could work (see "R: summing over an interval of rows"), but I can't find a way to sum across blocks of columns instead of blocks of rows, nor how to repeat this over the whole data frame. Could someone help me with some code for doing this? Thank you very much!
Reputation: 4358
This works only if the number of columns is divisible by the interval.
+(sapply(split.default(df,unlist(lapply(1:(ncol(df)/3),rep,3))),rowSums) > 0)
1 2
1 1 1
2 1 0
3 NA NA
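Note that row 3 comes out as NA NA here, while the question asks for NA 0 (NA only when every value in the block is missing). A minimal sketch of one way to get that behaviour follows. The toy data frame with simplified column names and the helper block_flag are hypothetical names introduced only for this illustration; they are not part of the original question or answer.

```r
# Toy data shaped like the question's example (column names simplified).
df <- data.frame(a = c(1, 1, NA), b = c(0, 1, NA), c = c(1, 0, NA),
                 d = c(1, 0, NA), e = c(1, 0, NA), f = c(1, 0, 0))

# Group label for each column: 1 1 1 2 2 2
# (assumes ncol(df) is divisible by 3, as in the answer above).
grp <- rep(seq_len(ncol(df) / 3), each = 3)

# Hypothetical helper: 1 if the non-missing values of a block sum to > 0,
# 0 otherwise, and NA only when every value in the block is missing.
block_flag <- function(block) {
  m <- as.matrix(block)
  all_na <- rowSums(!is.na(m)) == 0            # rows with no data at all
  flag <- as.integer(rowSums(m, na.rm = TRUE) > 0)
  flag[all_na] <- NA
  flag
}

res <- sapply(split.default(df, grp), block_flag)
res
```

For the toy data, row 3 becomes NA for the first block and 0 for the second, which matches the result requested in the question.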
Maybe someone else can find a more elegant way of creating the split than:
unlist(lapply(1:(ncol(df)/3),rep,3))
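One possible alternative, offered as a sketch rather than a definitive answer: when the column count is divisible by 3, rep(seq_len(ncol(df) / 3), each = 3) builds the same grouping vector. Integer division on the column index also works when the count is not divisible, in which case the last group is simply shorter. The 7-column toy data frame and the names k, grp and res below are assumptions made for this illustration.

```r
# Toy data with 7 columns, so the column count is NOT divisible by 3.
df <- data.frame(a = c(1, 1), b = c(0, 1), c = c(1, 0),
                 d = c(1, 0), e = c(1, 0), f = c(1, 0), g = c(0, 1))

k <- 3  # block width

# Integer division of the zero-based column index gives the block number:
# columns 1-3 -> 1, columns 4-6 -> 2, column 7 -> 3.
grp <- (seq_along(df) - 1) %/% k + 1

# Same pattern as the answer above, with the new grouping vector.
res <- +(sapply(split.default(df, grp), rowSums) > 0)
res
```

Here the leftover seventh column forms a third block of its own instead of causing an error or being recycled into another group.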