Reputation: 3650
I have multiple observations from different persons on different dates, e.g.
df <- data.frame(id = c(rep(1, 5), rep(2, 8), rep(3, 7)),
                 dates = seq.Date(as.Date("2015-01-01"), by = "month", length.out = 20))
Here we have 3 people (id), each with a different number of observations.
I now want to count the Mondays, Tuesdays, etc. for each person.
This should be done using dplyr and summarize, because my real data set has many more columns, which I summarize with different statistics.
It should be something like this:
summa <- df %>% group_by(id) %>%
summarize(mondays = # number of Mondays,
tuesdays = # number of Tuesdays,
.........)
How can this be achieved?
Upvotes: 0
Views: 2399
Reputation: 14988
Using base R date functions (note that weekdays() returns day names in the current locale):
summa <- df %>% group_by(id) %>%
summarise(monday = sum(weekdays(dates) == "Monday"),
tuesday = sum(weekdays(dates) == "Tuesday"))
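Since weekdays() depends on the locale, comparing against "Monday" only works on an English system. A minimal sketch of a locale-independent variant, assuming the df from the question, uses format(dates, "%u"), which returns the ISO weekday number as a string ("1" for Monday through "7" for Sunday):

```r
library(dplyr)

# Example data from the question
df <- data.frame(id = c(rep(1, 5), rep(2, 8), rep(3, 7)),
                 dates = seq.Date(as.Date("2015-01-01"), by = "month", length.out = 20))

# "%u" gives the weekday as "1" (Monday) to "7" (Sunday), regardless of locale
summa <- df %>%
  group_by(id) %>%
  summarise(mondays  = sum(format(dates, "%u") == "1"),
            tuesdays = sum(format(dates, "%u") == "2"))
```

The remaining days follow the same pattern with "3" through "7".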
Upvotes: 1
Reputation: 70336
I would do the following:
summa <- count(df, id, day = weekdays(dates))
# or:
# summa <- df %>%
# mutate(day = weekdays(dates)) %>%
# count(id, day)
head(summa)
#Source: local data frame [6 x 3]
#Groups: id [2]
#
# id day n
# (dbl) (chr) (int)
#1 1 Donnerstag 1
#2 1 Freitag 1
#3 1 Mittwoch 1
#4 1 Sonntag 2
#5 2 Dienstag 2
#6 2 Donnerstag 1
But you can also reshape to wide format:
library(tidyr)
spread(summa, day, n, fill=0)
#Source: local data frame [3 x 8]
#Groups: id [3]
#
# id Dienstag Donnerstag Freitag Mittwoch Montag Samstag Sonntag
# (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl)
#1 1 0 1 1 1 0 0 2
#2 2 2 1 1 1 1 1 1
#3 3 1 0 2 1 2 0 1
My results are in German because weekdays() returns day names in the current locale; yours will be in your own language. The column names above are the German weekday names.
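In more recent tidyr versions, spread() is superseded by pivot_wider(). A rough equivalent of the reshaping step, assuming tidyr 1.1.0 or later (where values_fill accepts a single value) and the df from the question, might look like this; the name wide is just a placeholder chosen here:

```r
library(dplyr)
library(tidyr)

# Example data from the question
df <- data.frame(id = c(rep(1, 5), rep(2, 8), rep(3, 7)),
                 dates = seq.Date(as.Date("2015-01-01"), by = "month", length.out = 20))

# Long format: one row per id/weekday combination
summa <- count(df, id, day = weekdays(dates))

# Wide format: one column per weekday, missing combinations filled with 0
wide <- pivot_wider(summa, names_from = day, values_from = n, values_fill = 0L)
```

The column names are still the locale-dependent weekday names, so they will differ between systems.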
If you want to use summarize explicitly, you can achieve the same as above using:
summa <- df %>%
group_by(id, day = weekdays(dates)) %>%
summarize(n = n()) # or do something with summarise_each() for many columns
Upvotes: 4
Reputation: 131
You could use the lubridate package:
library(lubridate)
summa <- df %>% group_by(id) %>%
summarize(mondays = sum(wday(dates) == 2),
....
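Note that wday() numbers the days starting from Sunday = 1 by default, which is why Monday is 2 above. A complete sketch for the first two days, assuming lubridate 1.7.0 or later (for the week_start argument) and the df from the question, could set the week start explicitly so that Monday is 1:

```r
library(dplyr)
library(lubridate)

# Example data from the question
df <- data.frame(id = c(rep(1, 5), rep(2, 8), rep(3, 7)),
                 dates = seq.Date(as.Date("2015-01-01"), by = "month", length.out = 20))

# week_start = 1 makes Monday day 1, Tuesday day 2, ..., Sunday day 7
summa <- df %>%
  group_by(id) %>%
  summarize(mondays  = sum(wday(dates, week_start = 1) == 1),
            tuesdays = sum(wday(dates, week_start = 1) == 2))
```

Setting week_start in the call also protects against a different lubridate.week.start option being set elsewhere in the session.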
Upvotes: 3