Reputation: 129
I would like to add a column to my dataframe which would calculate the number of rows per park a month before each date.
My dataframe looks like this:
> Reaction_per_park_per_day_3
Park Date Type_1_2 Coy_season AC_code Year
<chr> <date> <chr> <dbl> <chr> <dbl>
1 Airways Park 2019-01-14 1 1 6 2019
2 Airways Park 2019-01-16 0 1 4 2019
3 Airways Park 2019-01-24 0 1 4 2019
4 Bayview 2020-07-05 0 3 3 2020
5 Bayview 2020-09-27 0 3 3 2020
The resulting column would look like this:
Park Date Type_1_2 Coy_season AC_code Year AC_Past_month
<chr> <date> <chr> <dbl> <chr> <dbl>
1 Airways Park 2019-01-14 1 1 6 2019 0
2 Airways Park 2019-01-16 0 1 4 2019 1
3 Airways Park 2019-01-24 0 1 4 2019 2
4 Bayview 2020-07-05 0 3 3 2020 0
5 Bayview 2020-09-27 0 3 3 2020 0
I am able to calculate the total prior number of rows using this code, but don't seem to be able to figure out how to restrict it to only the past month.
Reaction_per_park_per_day_3 <- Reaction_per_park_per_day_2 %>%
group_by(Park) %>%
mutate(Total_prior_AC = row_number() -1)
I tried to restrict it in time using the following code, but it did not work at all (the code does not even run).
Reaction_per_park_per_day_3 <- Reaction_per_park_per_day_3 %>%
group_by(Park, Date, Type_1_2, Coy_season, AC_code, Year) %>%
summarise(Number_4w_AC = sum(Date - weeks(4)), .groups = 'drop')
I would prefer dplyr solutions if possible, but I am open to any suggestions!
Upvotes: 3
Views: 251
Reputation: 160447
library(dplyr)
Reaction_per_park_per_day_3 %>%
group_by(Park) %>%
mutate(
AA = sapply(Date, function(d) sum(between(as.numeric(difftime(d, Date[Date < d], units = "days")), 0, 31)))
) %>%
ungroup()
# # A tibble: 5 x 8
# Park Date Type_1_2 Coy_season AC_code Year AC_Past_month AA
# <chr> <chr> <int> <int> <int> <int> <int> <int>
# 1 Airways Park 2019-01-14 1 1 6 2019 0 0
# 2 Airways Park 2019-01-16 0 1 4 2019 1 1
# 3 Airways Park 2019-01-24 0 1 4 2019 2 2
# 4 Bayview 2020-07-05 0 3 3 2020 0 0
# 5 Bayview 2020-09-27 0 3 3 2020 0 0
Data
Reaction_per_park_per_day_3 <- structure(list(Park = c("Airways Park", "Airways Park", "Airways Park", "Bayview", "Bayview"), Date = c("2019-01-14", "2019-01-16", "2019-01-24", "2020-07-05", "2020-09-27"), Type_1_2 = c(1L, 0L, 0L, 0L, 0L), Coy_season = c(1L, 1L, 1L, 3L, 3L), AC_code = c(6L, 4L, 4L, 3L, 3L), Year = c(2019L, 2019L, 2019L, 2020L, 2020L), AC_Past_month = c(0L, 1L, 2L, 0L, 0L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5"))
Upvotes: 1