nightstand
nightstand

Reputation: 783

How to pivot longer or add rows to a dataframe based on date ranges?

I have this data frame with id, id_month_year, start_date, end_date, daylight_begin, and daylight_end, where id is the ID for the record, id_month_year is just a string of the record's ID, month and year. start_date and end_date are start and end dates of the record. daylight_begin and daylight_end are the beginning and ending dates of daylight savings time.

df <- data.frame(
  id = c(rep("A01", 5), rep("B01", 4)),
  id_month_year = c("A01 Jan 2023", "A01 March 2023", "A01 November 2022", "A01 June 2022",
                    "A01 March 2022", "B02 March 2022", "B02 November 2022", "B02 March 2023", "B02 May 2022"),
  start_date = c("2023-01-04", "2023-03-01", "2022-11-01", "2022-06-05",
                 "2022-03-02", "2022-03-04", "2022-11-05", "2023-03-02", "2022-05-03"),
  end_date = c("2023-01-25", "2023-03-24", "2022-11-27", "2022-06-24",
               "2022-03-29", "2022-03-16", "2022-11-28", "2023-03-30", "2022-05-30")
) %>%
  mutate(across(c(start_date, end_date), ~as.Date(.x)),
         daylight_begin = if_else(year(start_date) == "2022", as_date("2022-03-13"), as_date("2023-03-12")),
         daylight_end = if_else(year(end_date) == "2022", as_date("2022-11-06"), as_date("2023-11-05"))

  )

For the start and end dates that are within the daylight savings date ranges, I want to break the date interval using the daylight savings dates as cutoff dates. i.e. if either daylight_begin or daylight_end dates are within the original start_date and end_date range, then I want to break the date range using the appropriate daylight_begin or daylight_end as a cutoff point.

The two years have different daylight_begin and daylight_end dates:

For example, for id_month_year of "A01 March 2023," the start date is "2023-03-01" and the end date is "2023-03-24". Since this date range includes daylight_begin of "2023-03-12", the new date ranges for "A01 March 2023" would be from:

I also want to duplicate the id and id_month_year for the row that needs to be broken down. So essentially I want to change the row from this

enter image description here

to that.

enter image description here

After breaking all the date ranges by daylight savings dates, I want my data frame to look like this:

new_df <- data.frame(
  id = c(rep("A01", 8), rep("B01", 7)),
  id_month_year = c("A01 Jan 2023", rep("A01 March 2023", 2), rep("A01 November 2022", 2), "A01 June 2022",
                    rep("A01 March 2022",2), rep("B02 March 2022",2), rep("B02 November 2022",2), rep("B02 March 2023",2), "B02 May 2022"),
  start_date = c("2023-01-04", "2023-03-01", "2023-03-12", "2022-11-01",  "2022-11-06", "2022-06-05",
                 "2022-03-02", "2022-03-13", "2022-03-04", "2022-03-13", "2022-11-05", "2022-11-06", "2023-03-02", "2023-03-12","2022-05-03"),
  end_date = c("2023-01-25","2023-03-11", "2023-03-24", "2022-11-05", "2022-11-27", "2022-06-24",
               "2022-03-12", "2022-03-29", "2022-03-12", "2022-03-16", "2022-11-05", "2022-11-28", "2023-03-11", "2023-03-30", "2022-05-30")
) %>%
  mutate(across(c(start_date, end_date), ~as.Date(.x)),
    daylight_begin = if_else(year(start_date) == "2022", as_date("2022-03-13"), as_date("2023-03-12")),
         daylight_end = if_else(year(end_date) == "2022", as_date("2022-11-06"), as_date("2023-11-05"))
    )
> new_df
    id     id_month_year start_date   end_date daylight_begin daylight_end
1  A01      A01 Jan 2023 2023-01-04 2023-01-25     2023-03-12   2023-11-05
2  A01    A01 March 2023 2023-03-01 2023-03-11     2023-03-12   2023-11-05
3  A01    A01 March 2023 2023-03-12 2023-03-24     2023-03-12   2023-11-05
4  A01 A01 November 2022 2022-11-01 2022-11-05     2022-03-13   2022-11-06
5  A01 A01 November 2022 2022-11-06 2022-11-27     2022-03-13   2022-11-06
6  A01     A01 June 2022 2022-06-05 2022-06-24     2022-03-13   2022-11-06
7  A01    A01 March 2022 2022-03-02 2022-03-12     2022-03-13   2022-11-06
8  A01    A01 March 2022 2022-03-13 2022-03-29     2022-03-13   2022-11-06
9  B01    B02 March 2022 2022-03-04 2022-03-12     2022-03-13   2022-11-06
10 B01    B02 March 2022 2022-03-13 2022-03-16     2022-03-13   2022-11-06
11 B01 B02 November 2022 2022-11-05 2022-11-05     2022-03-13   2022-11-06
12 B01 B02 November 2022 2022-11-06 2022-11-28     2022-03-13   2022-11-06
13 B01    B02 March 2023 2023-03-02 2023-03-11     2023-03-12   2023-11-05
14 B01    B02 March 2023 2023-03-12 2023-03-30     2023-03-12   2023-11-05
15 B01      B02 May 2022 2022-05-03 2022-05-30     2022-03-13   2022-11-06

Upvotes: 0

Views: 52

Answers (1)

r2evans
r2evans

Reputation: 160607

Here's a two-step:

library(dplyr)
df <- df %>%
  mutate(
    split_left = start_date < daylight_begin & between(end_date, daylight_begin, daylight_end),
    split_right = between(start_date, daylight_begin, daylight_end) & end_date > daylight_end
  )
bind_rows(
  filter(df, !split_left, !split_right),
  filter(df, split_right) %>% mutate(end_date = daylight_end - 1),
  filter(df, split_right) %>% mutate(start_date = daylight_end),
  filter(df, split_left) %>% mutate(start_date = daylight_begin),
  filter(df, split_left) %>% mutate(end_date = daylight_begin - 1)
) %>%
  select(-split_left, -split_right)
#     id     id_month_year start_date   end_date daylight_begin daylight_end
# 1  A01      A01 Jan 2023 2023-01-04 2023-01-25     2023-03-12   2023-11-05
# 2  A01     A01 June 2022 2022-06-05 2022-06-24     2022-03-13   2022-11-06
# 3  B01      B02 May 2022 2022-05-03 2022-05-30     2022-03-13   2022-11-06
# 4  A01 A01 November 2022 2022-11-01 2022-11-05     2022-03-13   2022-11-06
# 5  B01 B02 November 2022 2022-11-05 2022-11-05     2022-03-13   2022-11-06
# 6  A01 A01 November 2022 2022-11-06 2022-11-27     2022-03-13   2022-11-06
# 7  B01 B02 November 2022 2022-11-06 2022-11-28     2022-03-13   2022-11-06
# 8  A01    A01 March 2023 2023-03-12 2023-03-24     2023-03-12   2023-11-05
# 9  A01    A01 March 2022 2022-03-13 2022-03-29     2022-03-13   2022-11-06
# 10 B01    B02 March 2022 2022-03-13 2022-03-16     2022-03-13   2022-11-06
# 11 B01    B02 March 2023 2023-03-12 2023-03-30     2023-03-12   2023-11-05
# 12 A01    A01 March 2023 2023-03-01 2023-03-11     2023-03-12   2023-11-05
# 13 A01    A01 March 2022 2022-03-02 2022-03-12     2022-03-13   2022-11-06
# 14 B01    B02 March 2022 2022-03-04 2022-03-12     2022-03-13   2022-11-06
# 15 B01    B02 March 2023 2023-03-02 2023-03-11     2023-03-12   2023-11-05

Upvotes: 1

Related Questions