Dan Lewer

Reputation: 956

R: adding rows in a data frame depending on another variable

I'm trying to do a kind of conditional rowSums.

I have a data frame with four columns containing 1's and 0's, and another variable that indicates which columns should be added to make the row totals.

For example:

df <- matrix(rbinom(40, 1, 0.5), ncol = 4)
df <- as.data.frame(df)
df$group <- sample(c('12', '123', '1234'), 10, replace = TRUE)

If the group is 12, then columns V1:V2 should be added; if 123, then V1:V3; and if 1234, then V1:V4.

I've tried a labour-intensive approach:

df$total12 <- rowSums(df[,c('V1', 'V2')])
df$total123 <- rowSums(df[,c('V1', 'V2', 'V3')])
df$total1234 <- rowSums(df[,c('V1', 'V2', 'V3', 'V4')])
df$total <- ifelse(df$group == '12', df$total12,
                   ifelse(df$group == '123', df$total123, df$total1234))

Is there a simpler way to do this?

Upvotes: 3

Views: 129

Answers (2)

Dave2e

Reputation: 24139

Here is another option using the switch function. This is more readable and easier to extend than a series of nested ifelse statements.

# For each row, pick the rowSums call that matches its group
df$total <- sapply(seq_along(df$group), function(i) {
  switch(df$group[i],
         "12"   = rowSums(df[i, c('V1', 'V2')]),
         "123"  = rowSums(df[i, c('V1', 'V2', 'V3')]),
         "1234" = rowSums(df[i, c('V1', 'V2', 'V3', 'V4')]))
})

Basically, this loops through the elements of df$group and selects the proper formula for each row. Because it handles one row at a time, it is slower than a vectorised approach, but performance should be acceptable if your dataset isn't too long.
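As a rough sketch of how the per-row selection could be extended, the group-to-columns mapping can also be kept in a named list rather than in the switch call. The names `toy` and `cols_for` are made up for this illustration and the small data frame is hand-written so the result is easy to check; this is one possible variant, not the answer's own code.

```r
# Hand-made example data (hypothetical), with one row per group label
toy <- data.frame(V1 = c(1, 0, 1), V2 = c(1, 1, 0),
                  V3 = c(0, 1, 1), V4 = c(1, 1, 1),
                  group = c("12", "123", "1234"),
                  stringsAsFactors = FALSE)

# Hypothetical lookup table: group label -> columns to add
cols_for <- list("12"   = c("V1", "V2"),
                 "123"  = c("V1", "V2", "V3"),
                 "1234" = c("V1", "V2", "V3", "V4"))

# For each row, look up its columns and sum only that row's cells
toy$total <- vapply(seq_len(nrow(toy)), function(i) {
  sum(unlist(toy[i, cols_for[[toy$group[i]]]]))
}, numeric(1))

toy$total
```

Adding a new group then only means adding one entry to `cols_for`; the loop itself stays unchanged.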

Upvotes: 1

akrun

Reputation: 887951

Here is an option. We create a row/column index by splitting 'group' into its characters, extract the values of 'df' based on that index, and get the sum grouped by the row index.

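# Column numbers for each row, taken from the characters of 'group'
lst <- strsplit(df$group, "")
# Two-column (row, column) index matrix
i1 <- cbind(rep(seq_len(nrow(df)), lengths(lst)), as.integer(unlist(lst)))
# tapply returns one sum per row; ave would return one value per extracted cell
df$total <- as.vector(tapply(df[-5][i1], i1[, 1], FUN = sum))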
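To make the index-matrix idea easier to follow, here is a small worked sketch on a hand-written data frame (`toy` and `vals` are names invented for this illustration). It assumes the goal is one total per row, so it uses tapply for the grouped sum.

```r
# Hand-made example data (hypothetical), with one row per group label
toy <- data.frame(V1 = c(1, 0, 1), V2 = c(1, 1, 0),
                  V3 = c(0, 1, 1), V4 = c(1, 1, 1),
                  group = c("12", "123", "1234"),
                  stringsAsFactors = FALSE)

# One character vector of column numbers per row: "1" "2", then "1" "2" "3", ...
lst <- strsplit(toy$group, "")

# Two-column matrix: row index in column 1, column index in column 2
i1 <- cbind(rep(seq_len(nrow(toy)), lengths(lst)), as.integer(unlist(lst)))

# Matrix indexing extracts one cell per (row, column) pair
vals <- toy[-5][i1]

# Sum the extracted cells within each row index
toy$total <- as.vector(tapply(vals, i1[, 1], sum))

toy$total
```

Here `i1` has nine rows (2 + 3 + 4 cells), and the grouped sum collapses them back to one value for each of the three rows.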

Upvotes: 1
