Reputation: 139
I am trying to do a count of rows that fall on and between two dates (minimum and maximum) per group. The only caveat is each group has a different pair of dates. See example below.
This is my raw dataset.
raw <- data.frame ("Group" = c("A", "B", "A", "A", "B"), "Date" = c("2017-01-01", "2017-02-02", "2017-09-01", "2017-12-31", "2017-05-09"))
I would like it to return this...
clean <- data.frame ("Group" = c("A", "B"), "Min" = c("2017-01-01", "2017-02-02"), "Max" = c("2017-12-31", "2017-05-09"), "Count" = c(3, 2))
How would I be able to do this? The mix and max variable are not crucial, but definitely would like to know how to do the count variable. Thank you!
Upvotes: 1
Views: 830
Reputation: 1563
The date range is given or you want to calculate it from data as well. If later is true then this should do it.
require(tidyverse)
raw %>%
mutate(Date = as.Date(Date)) %>%
group_by(Group) %>%
summarise(min_date = min(Date), max_date = max(Date), count = n())
Output:
# A tibble: 2 x 4
Group min_date max_date count
<fct> <date> <date> <int>
1 A 2017-01-01 2017-12-31 3
2 B 2017-02-02 2017-05-09 2
Upvotes: 1