Reputation: 95
I have the following data:
Name        Date                      Message
Ted Foe     2011-06-10T05:06:30+0000  I love this product
Sina Fall   2011-06-10T05:07:33+0000  Not my type of product
Steve Hoe   2011-06-11T05:06:30+0000  Great Discussion! Thanks
Selda Dee   2011-06-13T05:12:30+0000  Seen elsewhere
Steven Hoe  2011-06-13T03:17:31+0000  Where?
Selda Dee   2011-06-13T05:17:56+0000  Tinder
I want to aggregate by day so that I end up with a time series like this:
Date        Number of Posts
2011-06-10  2
2011-06-11  1
2011-06-12  0
2011-06-13  3
I already tried the following:
summary_df <- df %>% group_by(Date) %>% summarise(comments = count(message))
But this is not working. Any quick dplyr based solution would be great.
Thanks for the help!
Cheers, Raoul
Upvotes: 1
Views: 3110
Reputation: 887118
Group by the 'Date' column after converting it to Date class, then get the number of rows (n()) with summarise. If we also need the 'Date' values that are missing from the original dataset, create a new dataset with the sequence from the minimum to the maximum 'Date' and do a left_join with the summarised data:
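library(dplyr)

# Count the posts per calendar day
df1 <- df %>%
  group_by(Date = as.Date(Date)) %>%
  summarise(comments = n())
# Join onto the full day sequence so that days without posts are kept (as NA)
expand.grid(Date = seq(min(df1$Date), max(df1$Date), by = '1 day')) %>%
  left_join(df1, by = 'Date')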
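Note that the left_join leaves NA, not 0, for days without posts, while the question asks for 0 on 2011-06-12. A minimal sketch of one way to fill those in is below. The sample data frame is typed in from the question, and the names daily, all_days and result are only illustrative; coalesce() from dplyr is one option for the replacement, not the only one.

```r
library(dplyr)

# Sample data copied from the question
df <- data.frame(
  Name = c("Ted Foe", "Sina Fall", "Steve Hoe",
           "Selda Dee", "Steven Hoe", "Selda Dee"),
  Date = c("2011-06-10T05:06:30+0000", "2011-06-10T05:07:33+0000",
           "2011-06-11T05:06:30+0000", "2011-06-13T05:12:30+0000",
           "2011-06-13T03:17:31+0000", "2011-06-13T05:17:56+0000"),
  Message = c("I love this product", "Not my type of product",
              "Great Discussion! Thanks", "Seen elsewhere",
              "Where?", "Tinder"),
  stringsAsFactors = FALSE
)

# Count posts per calendar day; as.Date() keeps only the leading
# YYYY-MM-DD part of each timestamp string
daily <- df %>%
  group_by(Date = as.Date(Date)) %>%
  summarise(comments = n())

# Every day between the first and last post, including days without posts
all_days <- data.frame(
  Date = seq(min(daily$Date), max(daily$Date), by = "1 day")
)

# Days missing from `daily` come back as NA after the join;
# coalesce() replaces those NAs with 0
result <- all_days %>%
  left_join(daily, by = "Date") %>%
  mutate(comments = coalesce(comments, 0L))

print(result)
```

The replacement value is written as 0L because n() returns an integer count, and coalesce() expects both arguments to have compatible types.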
Upvotes: 4