Orhan Yazar

Reputation: 909

Trying to sum a different number of columns in each row at the same time

Data

Suppose you have this data.table or data.frame (I'm working with data.table):

a <- c(1, 6.7, 7.0, 6.5, 7.0, 7.2, 4.2, 5, 6.6, 6.7)
b <- c(2, 5.0, 3.5, 4.9, 7.8, 9.3, 8.0, 7.8, 8.0, 10)
c <- c(3, 7.0, 5.5, 7.2, 7.7, 7.2, 8.0, 7.6, 7, 6.7)
d <- c(4, 7.0, 7.0, 7.0, 6.9, 6.8, 9.0, 6.0, 6.6, 6.7)
df <- data.frame(rbind(a, b, c, d))

  X1  X2  X3  X4  X5  X6  X7  X8  X9  X10
a  1 6.7 7.0 6.5 7.0 7.2 4.2 5.0 6.6  6.7
b  2 5.0 3.5 4.9 7.8 9.3 8.0 7.8 8.0 10.0
c  3 7.0 5.5 7.2 7.7 7.2 8.0 7.6 7.0  6.7
d  4 7.0 7.0 7.0 6.9 6.8 9.0 6.0 6.6  6.7

Problem

I'm trying to sum X3 and X4 for the first row, X3, X4 and X5 for the second, and so on.

What I did

I have a vector called iter:

iter <- c(1,2,3,4)

and I used a for loop:

for (i in seq_len(nrow(df))) {
  df$sum[i] <- sum(as.numeric(df[i, 2:(2 + iter[i])]), na.rm = TRUE)
}

Is there a way to do this without a for loop?

Expected output

output 
   13.7  # corresponds to df[1,X3]+df[1,X4]
   13.4  # corresponds to df[2,X3]+df[2,X4]+df[2,X5]
   27.4  # corresponds to df[3,X3]+df[3,X4]+df[3,X5]+df[3,X6]
   34.7  # corresponds to df[4,X3]+df[4,X4]+df[4,X5]+df[4,X6]+df[4,X7]

EDIT

iter <- c(1,2,3,4)

is completely arbitrary here, so I need a solution that works for any value of iter.

Upvotes: 2

Views: 69

Answers (3)

Damiano Fantini

Reputation: 1975

What about this? If iter specifies the number of columns to sum:

iter <- c(2, 5, 4, 2)
sapply(seq_along(iter), function(i) {
  ri <- iter[i]
  sum(df[i, 3:(3 + ri - 1)])
})

If instead you use iter for the order of the rows (for example, to reorder the rows of the data frame):

iter <- c(1, 2, 3, 4)
sapply(seq_along(iter), function(i) {
  ri <- iter[i]
  sum(df[ri, 3:(3 + i)])
})

Upvotes: 1

Ape

Reputation: 1169

The elements of df are factors, which complicates the solution a bit. First, I convert the relevant columns to a numeric matrix.

Edit: with the updated version of df, without factors:

mat <- sapply(df[,-1], as.numeric)
rowSums(mat*cbind(TRUE, lower.tri(mat[,-1], diag = TRUE)))

[1] 13.7 13.4 27.4 34.7

Using arbitrary iter:

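# row i is TRUE for its first iter[i]+1 columns; the mask has ncol(df)-1 columns, matching df[,-1]
index.mat = t(sapply(iter, function(x){rep(c(TRUE,FALSE), times = c(x+1, ncol(df)-x-2))}))
rowSums(df[,-1]*index.mat)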

20.2 38.5 34.6 27.9
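
The same masking idea can also be sketched without sapply, by comparing column positions against iter directly. This is only an illustration under a few assumptions: df is the numeric data frame from the question (rebuilt here so the snippet runs on its own), the sum starts at X2 and covers iter[i] + 1 columns as in the question's loop, and the names m, mask and res are made up for this example.

```r
# Rebuild the question's data so the snippet is self-contained
a <- c(1, 6.7, 7.0, 6.5, 7.0, 7.2, 4.2, 5, 6.6, 6.7)
b <- c(2, 5.0, 3.5, 4.9, 7.8, 9.3, 8.0, 7.8, 8.0, 10)
c <- c(3, 7.0, 5.5, 7.2, 7.7, 7.2, 8.0, 7.6, 7, 6.7)
d <- c(4, 7.0, 7.0, 7.0, 6.9, 6.8, 9.0, 6.0, 6.6, 6.7)
df <- data.frame(rbind(a, b, c, d))

iter <- c(1, 2, 3, 4)

# Drop X1 and work on a plain numeric matrix (columns X2..X10)
m <- as.matrix(df[, -1])

# col(m) holds each cell's column position; iter + 1 is recycled down the
# rows, so row i is TRUE for its first iter[i] + 1 columns
mask <- col(m) <= iter + 1

# Multiplying by the logical mask zeroes out the unwanted cells
res <- rowSums(m * mask)
print(res)
```

Because the comparison is recycled along rows, this relies on iter having exactly one entry per row of df.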

Upvotes: 3

lmo

Reputation: 38510

You can use Reduce with accumulate=TRUE and then extract the values.

# initialize iter variable
iter <- 1:4

# calculate cumulative row sums, dropping initial list element
vals <- Reduce("+", df[2:10], accumulate=TRUE)[-1]

# pull out what you want with recursive indexing and sapply
sapply(1:nrow(df), function(x) vals[[c(iter[x], x)]])
[1] 13.7 13.4 27.4 34.7
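
In case the recursive indexing step is unfamiliar: a vector inside [[ ]] is applied one level at a time, so vals[[c(k, x)]] picks element x of list element k. A toy sketch (the list toy and its values are hypothetical, chosen only for illustration):

```r
# Toy list of two numeric vectors (hypothetical values)
toy <- list(c(10, 20, 30), c(11, 21, 31))

# Recursive indexing: take list element 2, then its 3rd value
nested <- toy[[c(2, 3)]]

# The equivalent two-step form
direct <- toy[[2]][[3]]

print(identical(nested, direct))
```

In the answer above, the first index is iter[x] (which cumulative sum to use) and the second is x (which row to read from it).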

Upvotes: 3
