How can I group by two variables and create a new variable based on cumsum?

Question

I'm working on a dataset of hotel reviews. I've created a subset (440880 rows) as follows:

 df2
 Hotel_ID  Review_date  Negative_Rev       Positive_Rev   Negative  Positive
        1   2015/08/20     bad staff   comfortable room          1         1
        1   2015/08/30   No Negative         good staff          0         1
        2   2015/09/24      no staff        No Positive          1         1
        2   2016/02/03  No Breakfast   near city centre          1         1
        2   2016/03/22   No Negative        No Positive          0         0

where Negative and Positive are indicator variables derived from Negative_Rev and Positive_Rev (the value is 0 for "No Negative" or "No Positive"). I would like to group df2 by Hotel_ID and Review_date and create two new columns, Daily_Negative and Daily_Positive, computed with cumsum() on Negative and Positive respectively. I've tried, for example, this:

> df$Daily_Positive <- ddply(df, .(Review_Date, Hotel_ID), transform, Daily_Positive = cumsum(Positive))

Stefan · Accepted Answer

Here is another solution using the data.table package:

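library(data.table)
setDT(df2)  # convert df2 to a data.table by reference (needed if it is a plain data.frame)
# One row per hotel and day, holding the daily totals of Negative and Positive
df2[, .(Daily_Negative = sum(Negative), Daily_Positive = sum(Positive)), by = .(Hotel_ID, Review_date)]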
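
Note that sum() collapses each Hotel_ID/Review_date group to a single row of totals. If a running total on every review row is what is wanted, as the question's mention of cumsum suggests, one possible variant is to add the columns by reference with := and cumsum(). The following is a minimal sketch, not the original answer's code: the sample data is hypothetical and only mirrors the question's column names, with several reviews per hotel and day so that the running totals are visible.

```r
library(data.table)

# Hypothetical sample mirroring the question's columns (values are illustrative)
df2 <- data.table(
  Hotel_ID    = c(1, 1, 2, 2, 2),
  Review_date = c("2015/08/20", "2015/08/20", "2015/09/24", "2015/09/24", "2016/03/22"),
  Negative    = c(1, 0, 1, 1, 0),
  Positive    = c(1, 1, 0, 1, 0)
)

# Add running totals within each hotel/day group; every original row is kept
df2[, `:=`(Daily_Negative = cumsum(Negative),
           Daily_Positive = cumsum(Positive)),
    by = .(Hotel_ID, Review_date)]

print(df2)
```

Here the last row of each group carries the same value that sum() would return for that group, so the two approaches agree on the daily totals and differ only in whether the intermediate rows are kept.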
