Gab_Laj
Gab_Laj

Reputation: 129

Count the number of rows a month before a date

I would like to add a column to my dataframe which would calculate the number of rows per park a month before each date.

My dataframe looks like this:

> Reaction_per_park_per_day_3

Park                            Date       Type_1_2 Coy_season AC_code  Year
  <chr>                           <date>     <chr>         <dbl> <chr>   <dbl>
1 Airways Park                    2019-01-14 1                 1 6        2019              
2 Airways Park                    2019-01-16 0                 1 4        2019              
3 Airways Park                    2019-01-24 0                 1 4        2019              
4 Bayview                         2020-07-05 0                 3 3        2020              
5 Bayview                         2020-09-27 0                 3 3        2020

The resulting column would look like this:

Park                            Date       Type_1_2 Coy_season AC_code  Year    AC_Past_month
  <chr>                           <date>     <chr>         <dbl> <chr>   <dbl>
1 Airways Park                    2019-01-14 1                 1 6        2019              0            
2 Airways Park                    2019-01-16 0                 1 4        2019              1
3 Airways Park                    2019-01-24 0                 1 4        2019              2
4 Bayview                         2020-07-05 0                 3 3        2020              0
5 Bayview                         2020-09-27 0                 3 3        2020              0

I am able to calculate the total prior number of rows using this code, but don't seem to be able to figure out how to restrict it to only the past month.

Reaction_per_park_per_day_3 <- Reaction_per_park_per_day_2 %>%
  group_by(Park) %>%
  mutate(Total_prior_AC = row_number() -1)

I tried to restrict it in time using the following code, but it did not work at all (the code does not even run).

Reaction_per_park_per_day_3 <- Reaction_per_park_per_day_3 %>%
  group_by(Park, Date, Type_1_2, Coy_season, AC_code, Year) %>%
  summarise(Number_4w_AC = sum(Date - weeks(4)), .groups = 'drop')

I would prefer dplyr solutions if possible, but I am open to any suggestions!

Upvotes: 3

Views: 251

Answers (1)

r2evans
r2evans

Reputation: 160447

library(dplyr)
Reaction_per_park_per_day_3 %>%
  group_by(Park) %>%
  mutate(
    AA = sapply(Date, function(d) sum(between(as.numeric(difftime(d, Date[Date < d], units = "days")), 0, 31)))
  ) %>%
  ungroup()
# # A tibble: 5 x 8
#   Park         Date       Type_1_2 Coy_season AC_code  Year AC_Past_month    AA
#   <chr>        <chr>         <int>      <int>   <int> <int>         <int> <int>
# 1 Airways Park 2019-01-14        1          1       6  2019             0     0
# 2 Airways Park 2019-01-16        0          1       4  2019             1     1
# 3 Airways Park 2019-01-24        0          1       4  2019             2     2
# 4 Bayview      2020-07-05        0          3       3  2020             0     0
# 5 Bayview      2020-09-27        0          3       3  2020             0     0

Data

Reaction_per_park_per_day_3 <- structure(list(Park = c("Airways Park", "Airways Park", "Airways Park", "Bayview", "Bayview"), Date = c("2019-01-14", "2019-01-16", "2019-01-24", "2020-07-05", "2020-09-27"), Type_1_2 = c(1L, 0L, 0L, 0L, 0L), Coy_season = c(1L, 1L, 1L, 3L, 3L), AC_code = c(6L, 4L, 4L, 3L, 3L), Year = c(2019L, 2019L, 2019L, 2020L, 2020L), AC_Past_month = c(0L, 1L, 2L, 0L, 0L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5"))

Upvotes: 1

Related Questions