John Huang

Reputation: 845

Filtering by a specific number of days using lubridate

I have a data set that I would like to split into 10-day intervals. For example, I would like to group all of the dates from 26-12-2010 to 04-01-2011 for ID 1 together, then the next 10 days for ID 1 together, and so on. I would like to do this for each ID and compile the 10-day intervals into a list.

library(lubridate)
library(dplyr)   # %>%, mutate, arrange, group_split
library(purrr)   # map, reduce

date <- rep_len(seq(dmy("26-12-2010"), dmy("20-12-2013"), by = "days"), 500)
ID <- rep(seq(1, 5), 100)

df <- data.frame(date = date,
                 x = runif(length(date), min = 60000, max = 80000),
                 y = runif(length(date), min = 800000, max = 900000),
                 ID)

df %>% 
    # label every 10 consecutive rows with an interval id (1 to 50)
    mutate(interval = map(1:50, ~rep(.x, 10)) %>% reduce(c)) %>% 
    group_split(interval) %>%
    map(~arrange(.x, ID)) %>% 
    # split each interval again by ID, giving a nested list
    map(~ group_split(.x, ID)) %>% 
    head(2)

When I run the last lines of code, the data is split by days and IDs, but the observations that are supposed to fall within the same 10-day interval are not being grouped together.

Upvotes: 2

Views: 249

Answers (1)

Anoushiravan R

Reputation: 21908

I had difficulty understanding your desired output yesterday, but I don't see why you wouldn't start by arranging the rows by ID first. I hope this is what you are looking for:

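library(dplyr)
library(magrittr)   # for extract()

# slicing first 2 elements only
df %>%
  arrange(ID) %>% 
  # cut() bins the dates into 10-day windows; rleid() numbers each run of identical bins
  mutate(cut = data.table::rleid(cut(date, breaks = "10 day"))) %>% 
  group_split(ID, cut) %>%
  extract(1:2)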

[[1]]
# A tibble: 2 x 5
  date            x       y    ID   cut
  <date>      <dbl>   <dbl> <int> <int>
1 2010-12-26 73719. 803002.     1     1
2 2010-12-31 66825. 870527.     1     1

[[2]]
# A tibble: 2 x 5
  date            x       y    ID   cut
  <date>      <dbl>   <dbl> <int> <int>
1 2011-01-05 63023. 807545.     1     2
2 2011-01-10 76356. 875837.     1     2
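
If you would rather not depend on data.table, here is a minimal sketch of the same idea using plain date arithmetic. It rests on one assumption that is mine rather than something stated in the question: each ID's 10-day windows start at that ID's own earliest date, not at the overall earliest date. The column name `window` and the object `bins` are made up for illustration, and the toy data only mirrors the layout of the question's `df`.

```r
library(dplyr)

# Toy data with the same date/ID layout as the question (x and y omitted)
date <- rep_len(seq(as.Date("2010-12-26"), as.Date("2013-12-20"), by = "days"), 500)
df <- data.frame(date = date, ID = rep(1:5, 100))

bins <- df %>%
  group_by(ID) %>%
  # days elapsed since this ID's first date, integer-divided into 10-day windows
  mutate(window = as.integer(difftime(date, min(date), units = "days")) %/% 10L + 1L) %>%
  ungroup() %>%
  # one list element per (ID, window) pair, ordered by ID and then window
  group_split(ID, window)

bins[[1]]
```

Because `window` is computed inside `group_by(ID)`, the numbering restarts at 1 for every ID. If the windows should instead be anchored at the overall first date, dropping the `group_by(ID)` / `ungroup()` pair should do that.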

Upvotes: 1
