cdcarrion

Reputation: 590

Summarize Dates

I want to sum values by date for Monday through Friday, with any values that fall on a Saturday or Sunday added to the preceding Friday.

# A tibble: 10 x 2
#    date       value
#    <date>     <dbl>
#  1 2020-01-02     1   --- Thurs
#  2 2020-01-03     1   --- Fri
#  3 2020-01-04     2   --- Sat
#  4 2020-01-06     1   --- Mon
#  5 2020-01-07     1   --- Tues
#  6 2020-01-08     1   --- Wed
#  7 2020-01-09     1   --- Thurs
#  8 2020-01-10     1   --- Fri
#  9 2020-01-11     2   --- Sat
# 10 2020-01-13     1   --- Mon

This is what I want:

#    date       value
#    <date>     <dbl>
#  1 2020-01-02     1   --- Thurs
#  2 2020-01-03     3   --- Fri
#  4 2020-01-06     1   --- Mon
#  5 2020-01-07     1   --- Tues
#  6 2020-01-08     1   --- Wed
#  7 2020-01-09     1   --- Thurs
#  8 2020-01-10     3   --- Fri
# 10 2020-01-13     1   --- Mon

Data

dd <- structure(
  list(
    date = structure(
      c(18263, 18264, 18265, 18267, 18268, 18269, 18270, 18271, 18272, 18274),
      class = "Date"
    ),
    value = c(1, 1, 2, 1, 1, 1, 1, 1, 2, 1)
  ),
  row.names = c(NA, -10L),
  class = c("tbl_df", "tbl", "data.frame")
)

Upvotes: 1

Views: 99

Answers (1)

akrun

Reputation: 887871

We can use weekdays() to replace the 'date' values that fall on a weekend with NA, fill the NAs with the previous non-NA element (assuming 'date' is already sorted in ascending order), and then do a group_by sum.

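library(dplyr)
library(tidyr)
dd %>%
    # weekdays() returns day names in the current locale (English assumed here)
    mutate(Weekday = weekdays(date),
           date = replace(date, Weekday %in% c("Saturday", "Sunday"), NA)) %>%
    # carry the last non-NA date (the previous row's date) forward into the NAs
    fill(date) %>%
    # sum the values that now share a date
    group_by(date) %>%
    summarise(value = sum(value), .groups = 'drop')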

Output:

# A tibble: 8 x 2
#  date       value
#  <date>     <dbl>
#1 2020-01-02     1
#2 2020-01-03     3
#3 2020-01-06     1
#4 2020-01-07     1
#5 2020-01-08     1
#6 2020-01-09     1
#7 2020-01-10     3
#8 2020-01-13     1
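
As a variation on the approach above (not part of the original answer), here is a minimal sketch that avoids day names altogether, which may help if the session locale is not English. It relies on as.POSIXlt()$wday, which numbers weekdays 0 (Sunday) through 6 (Saturday), and shifts weekend dates back to the calendar Friday. The `result` name is only illustrative, and `dd` is rebuilt here so the snippet runs on its own. Note that this moves weekend values to the actual Friday date even if that Friday has no row of its own, whereas fill() uses whatever row comes before.

```r
library(dplyr)

# Same data as in the question, rebuilt so the example is self-contained
dd <- tibble(
  date  = as.Date(c("2020-01-02", "2020-01-03", "2020-01-04", "2020-01-06",
                    "2020-01-07", "2020-01-08", "2020-01-09", "2020-01-10",
                    "2020-01-11", "2020-01-13")),
  value = c(1, 1, 2, 1, 1, 1, 1, 1, 2, 1)
)

result <- dd %>%
  mutate(
    # 0 = Sunday, 1 = Monday, ..., 6 = Saturday; does not depend on the locale
    wday = as.POSIXlt(date)$wday,
    # Saturday moves back 1 day, Sunday moves back 2 days, weekdays stay put
    date = date - case_when(wday == 6 ~ 1, wday == 0 ~ 2, TRUE ~ 0)
  ) %>%
  group_by(date) %>%
  summarise(value = sum(value), .groups = "drop")

result
```

On the sample data this should give the same eight rows as the output above, because every weekend row there directly follows a Friday row.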

Upvotes: 1
