Ester Silva

Reputation: 680

Row Cumulative in a dataframe considering the Dates - R

I have a data frame in which each row contains the points a user obtained on certain dates (the dates are given in the last row).

Example:

         X1         X2          X3          X4          X5          X6
user1   123         0           324         8734        435         86
user2   34          63          65          35          566         92  
user3   45          54          8764        0           8976        874     
user4   0           21          7653        974         4235        324 
user5   5           647         842         2345        29          7652
Dates   2010-03-12  2010-03-12  2010-03-13  2010-03-14  2010-03-14  2010-03-14

I want to accumulate the values in each row by date, so that each output column holds the row's running total up to and including that date. Example (based on the table above):

        X1          X2          X3
user1   123         447         9702
user2   97          162         855     
user3   99          8863        18713
user4   21          7674        13207
user5   652         1494        11520
Dates   2010-03-12  2010-03-13  2010-03-14 

I could do it with a for loop, but I know that is not efficient, so I am looking for a more efficient way to do it.

Thanks!

Upvotes: 0

Views: 65

Answers (1)

Ronak Shah

Reputation: 388797

As suggested by @yarnabrina, we could transpose the data frame, convert the factor/character columns to numeric, group_by Dates and sum, and finally transpose the result again.

library(dplyr)

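# Transpose so each original column (X1..X6) becomes a row with a Dates value
data.frame(t(df)) %>%
   # t() yields text, so turn the user columns back into numbers
   mutate_at(vars(starts_with("user")), ~as.numeric(as.character(.))) %>%
   group_by(Dates) %>%
   summarise_all(sum) %>%   # per-date total for each user
   ungroup() %>% t %>% data.frame()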

#              X1         X2         X3
#Dates 2010-03-12 2010-03-13 2010-03-14
#user1        123        324       9255
#user2         97         65        693
#user3         99       8764       9850
#user4         21       7653       5533
#user5        652        842      10026

Another approach, using base R, is to split the columns by the dates in the last row, convert them to numeric, and take the row-wise sum of each group.

# Assumes the cells are character, not factor (as.numeric() on a factor returns level codes)
sapply(split.default(df[-nrow(df), ], unlist(df[nrow(df), ])),
         function(x) {x[] <- lapply(x, as.numeric); rowSums(x)})

#      2010-03-12 2010-03-13 2010-03-14
#user1        123        324       9255
#user2         97         65        693
#user3         99       8764       9850
#user4         21       7653       5533
#user5        652        842      10026
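
Note that both outputs above are per-date totals, whereas the expected table in the question is a running total across dates. A minimal sketch of one way to bridge that gap, assuming the example frame is stored with character columns (the names `per_date` and `running` are only illustrative, not part of the original answer), is to apply `cumsum` along each row of the per-date matrix:

```r
# Rebuild the question's example; every cell is text because of the Dates row
df <- data.frame(
  X1 = c("123", "34", "45", "0", "5", "2010-03-12"),
  X2 = c("0", "63", "54", "21", "647", "2010-03-12"),
  X3 = c("324", "65", "8764", "7653", "842", "2010-03-13"),
  X4 = c("8734", "35", "0", "974", "2345", "2010-03-14"),
  X5 = c("435", "566", "8976", "4235", "29", "2010-03-14"),
  X6 = c("86", "92", "874", "324", "7652", "2010-03-14"),
  row.names = c("user1", "user2", "user3", "user4", "user5", "Dates"),
  stringsAsFactors = FALSE
)

# Per-date totals, following the base R approach above
per_date <- sapply(
  split.default(df[-nrow(df), ], unlist(df[nrow(df), ])),
  function(x) {
    x[] <- lapply(x, as.numeric)
    rowSums(x)
  }
)

# Running total across the date columns; apply() returns one column per
# user, so transpose back to users-in-rows
running <- t(apply(per_date, 1, cumsum))

running["user1", ]
# 2010-03-12 2010-03-13 2010-03-14
#        123        447       9702
```

This relies on the date columns coming out in chronological order, which holds here because ISO-formatted dates sort correctly as text.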

Upvotes: 1
