Create multiple Dataframes based on the Dates of another Dataframe in R

I want to automatically create multiple Dataframes based on an interval of Dates of another Dataframe. Let's say I have this example:

df <- data.frame(Date = as.Date(c("2022-01-01", "2022-01-01", 
                                  "2022-01-02", "2022-01-02", "2022-01-02", 
                                  "2022-01-03", 
                                  "2022-01-04", "2022-01-04", 
                                  "2022-01-05", "2022-01-05", "2022-01-05")),
                 Name = c(LETTERS[1:11]),
                 Value = c(1:11))

My goal is to create 3 new Dataframes. df1 should contain the data from 2022-01-01 up to (but not including) 2022-01-04, df2 the data from 2022-01-02 up to (but not including) 2022-01-05, and df3 the data from 2022-01-03 up to (but not including) 2022-01-06. This is the desired output, with every object being a dataframe:

df1 <- data.frame(Date = as.Date(c("2022-01-01", "2022-01-01", 
                                  "2022-01-02", "2022-01-02", "2022-01-02", 
                                  "2022-01-03")),
                 Name = c(LETTERS[1:6]),
                 Value = c(1:6))

df2 <- data.frame(Date = as.Date(c("2022-01-02", "2022-01-02", "2022-01-02", 
                                   "2022-01-03", 
                                   "2022-01-04", "2022-01-04")),
                  Name = c(LETTERS[3:8]),
                  Value = c(3:8))

df3 <- data.frame(Date = as.Date(c("2022-01-03", 
                                   "2022-01-04", "2022-01-04", 
                                   "2022-01-05", "2022-01-05", "2022-01-05")),
                  Name = c(LETTERS[6:11]),
                  Value = c(6:11))

Notice that the number of observations differs from date to date. My actual Dataframe is much bigger than the example and will keep growing each day, so I need to automate this process. Any suggestions?

Upvotes: 0

Views: 213

Answers (1)

r2evans

Reputation: 160447

Here's one approach, using Map to filter df once per start date:

dates <- seq(df$Date[1], df$Date[1]+3, by = "day")
dates
# [1] "2022-01-01" "2022-01-02" "2022-01-03" "2022-01-04"
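# For each start date, keep the rows whose Date lies in [start, start + 3], both ends included
Map(function(a, b) dplyr::filter(df, dplyr::between(Date, a, b)), dates, dates + 3)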
# [[1]]
#         Date Name Value
# 1 2022-01-01    A     1
# 2 2022-01-01    B     2
# 3 2022-01-02    C     3
# 4 2022-01-02    D     4
# 5 2022-01-02    E     5
# 6 2022-01-03    F     6
# 7 2022-01-04    G     7
# 8 2022-01-04    H     8
# [[2]]
#         Date Name Value
# 1 2022-01-02    C     3
# 2 2022-01-02    D     4
# 3 2022-01-02    E     5
# 4 2022-01-03    F     6
# 5 2022-01-04    G     7
# 6 2022-01-04    H     8
# 7 2022-01-05    I     9
# 8 2022-01-05    J    10
# 9 2022-01-05    K    11
# [[3]]
#         Date Name Value
# 1 2022-01-03    F     6
# 2 2022-01-04    G     7
# 3 2022-01-04    H     8
# 4 2022-01-05    I     9
# 5 2022-01-05    J    10
# 6 2022-01-05    K    11
# [[4]]
#         Date Name Value
# 1 2022-01-04    G     7
# 2 2022-01-04    H     8
# 3 2022-01-05    I     9
# 4 2022-01-05    J    10
# 5 2022-01-05    K    11

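Granted, this made four frames instead of three, but that is easily controlled by the assignment to dates: the length of dates sets how many frames you get, and the offset in dates + 3 sets how many days each frame spans.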
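
As a minimal sketch of that control, using base R only (the names n_windows, width, starts and frames are made up for illustration): building the start dates with length.out fixes how many frames you get, and the offset added to each start date fixes how many days each frame covers. A three-day span (start through start + 2) appears to reproduce the row counts in the question's desired output, though that is my reading of it rather than something the question states.

```r
df <- data.frame(
  Date = as.Date(c("2022-01-01", "2022-01-01",
                   "2022-01-02", "2022-01-02", "2022-01-02",
                   "2022-01-03",
                   "2022-01-04", "2022-01-04",
                   "2022-01-05", "2022-01-05", "2022-01-05")),
  Name = LETTERS[1:11],
  Value = 1:11
)

n_windows <- 3  # hypothetical: how many frames to create
width <- 3      # hypothetical: days per frame, both ends included

# One start date per frame, beginning at the earliest date in df
starts <- seq(min(df$Date), by = "day", length.out = n_windows)

# Pair each start with its end date and subset df for that range
frames <- Map(
  function(a, b) df[df$Date >= a & df$Date <= b, , drop = FALSE],
  starts, starts + (width - 1)
)

sapply(frames, nrow)
# [1] 6 6 6
```

Changing n_windows or width is then the only edit needed as the data grows.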

This produces a list of frames, not three independent frames. I think you'll find that when you have multiple identically-structured frames (same column names and intent), it's best to keep them in a list. That way, when you want to do something to each of them, you can simply use lapply. See https://stackoverflow.com/a/24376207/3358227 for more discussion on this.
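
To illustrate why the list form is convenient, here is a small base R sketch (frames and totals are hypothetical names, and the three-day windows are an assumption chosen to match the question's desired output). It names the list elements, applies one summary to every frame with lapply, and shows list2env as an escape hatch in case separate df1, df2, df3 objects are really required.

```r
df <- data.frame(
  Date = as.Date(c("2022-01-01", "2022-01-01",
                   "2022-01-02", "2022-01-02", "2022-01-02",
                   "2022-01-03",
                   "2022-01-04", "2022-01-04",
                   "2022-01-05", "2022-01-05", "2022-01-05")),
  Name = LETTERS[1:11],
  Value = 1:11
)

# Assumed windows: three frames, each covering start through start + 2
starts <- seq(min(df$Date), by = "day", length.out = 3)
frames <- Map(
  function(a, b) df[df$Date >= a & df$Date <= b, , drop = FALSE],
  starts, starts + 2
)

# Name the elements so each frame can be reached as frames$df1, frames$df2, ...
names(frames) <- paste0("df", seq_along(frames))

# One call applies the same operation to every frame
totals <- lapply(frames, function(x) sum(x$Value))

# Only if separate objects are truly needed: copy each element into the
# global environment under its list name (df1, df2, df3)
invisible(list2env(frames, envir = globalenv()))
```

Keeping the frames in the list means the same lapply call keeps working when tomorrow's data adds another window, whereas separately named objects would each need their own line of code.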

Upvotes: 1
