Reputation: 37
I want to automatically create multiple Dataframes based on an interval of Dates of another Dataframe. Let's say I have this example:
df <- data.frame(Date = as.Date(c("2022-01-01", "2022-01-01",
"2022-01-02", "2022-01-02", "2022-01-02",
"2022-01-03",
"2022-01-04", "2022-01-04",
"2022-01-05", "2022-01-05", "2022-01-05")),
Name = c(LETTERS[1:11]),
Value = c(1:11))
My goal is to create 3 new Dataframes. df1 should contain the data from 2022-01-01 to 2022-01-04, df2 should contain the data from 2022-01-02 to 2022-01-05, and df3 should contain the data from 2022-01-03 to 2022-01-06. With that, this is the desired output, with all the objects being data frames:
df1 <- data.frame(Date = as.Date(c("2022-01-01", "2022-01-01",
"2022-01-02", "2022-01-02", "2022-01-02",
"2022-01-03")),
Name = c(LETTERS[1:6]),
Value = c(1:6))
df2 <- data.frame(Date = as.Date(c("2022-01-02", "2022-01-02", "2022-01-02",
"2022-01-03",
"2022-01-04", "2022-01-04")),
Name = c(LETTERS[3:8]),
Value = c(3:8))
df3 <- data.frame(Date = as.Date(c("2022-01-03",
"2022-01-04", "2022-01-04",
"2022-01-05", "2022-01-05", "2022-01-05")),
Name = c(LETTERS[6:11]),
Value = c(6:11))
Notice that the number of observations for each date is different. My actual Dataframe is much bigger than the example and it will keep growing each day, so I need to make this process automatic. Any suggestions?
Upvotes: 0
Reputation: 160447
Here's an alternative:
dates <- seq(df$Date[1], df$Date[1]+3, by = "day")
dates
# [1] "2022-01-01" "2022-01-02" "2022-01-03" "2022-01-04"
Map(function(a, b) dplyr::filter(df, dplyr::between(Date, a, b)), dates, dates + 3)
# [[1]]
# Date Name Value
# 1 2022-01-01 A 1
# 2 2022-01-01 B 2
# 3 2022-01-02 C 3
# 4 2022-01-02 D 4
# 5 2022-01-02 E 5
# 6 2022-01-03 F 6
# 7 2022-01-04 G 7
# 8 2022-01-04 H 8
# [[2]]
# Date Name Value
# 1 2022-01-02 C 3
# 2 2022-01-02 D 4
# 3 2022-01-02 E 5
# 4 2022-01-03 F 6
# 5 2022-01-04 G 7
# 6 2022-01-04 H 8
# 7 2022-01-05 I 9
# 8 2022-01-05 J 10
# 9 2022-01-05 K 11
# [[3]]
# Date Name Value
# 1 2022-01-03 F 6
# 2 2022-01-04 G 7
# 3 2022-01-04 H 8
# 4 2022-01-05 I 9
# 5 2022-01-05 J 10
# 6 2022-01-05 K 11
# [[4]]
# Date Name Value
# 1 2022-01-04 G 7
# 2 2022-01-04 H 8
# 3 2022-01-05 I 9
# 4 2022-01-05 J 10
# 5 2022-01-05 K 11
Granted, this made four instead of three, but that can easily be controlled by the assignment to dates.
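As a rough sketch of that control, the snippet below generates exactly three start dates and treats each end date as exclusive, which is what the six-row frames in the question seem to imply. It uses base R only. The names n_windows, width and frames are illustrative and not part of the original answer, and the half-open interval is an assumption inferred from the desired output.

```r
# Example data from the question
df <- data.frame(
  Date = as.Date(c("2022-01-01", "2022-01-01",
                   "2022-01-02", "2022-01-02", "2022-01-02",
                   "2022-01-03",
                   "2022-01-04", "2022-01-04",
                   "2022-01-05", "2022-01-05", "2022-01-05")),
  Name = LETTERS[1:11],
  Value = 1:11
)

n_windows <- 3  # hypothetical: number of frames to build
width <- 3      # hypothetical: window length in days (end date excluded)

# One start date per window, beginning at the earliest date in the data
starts <- seq(min(df$Date), by = "day", length.out = n_windows)

# Half-open windows [start, start + width): assumed from the desired output
frames <- lapply(seq_along(starts), function(i) {
  s <- starts[i]
  out <- df[df$Date >= s & df$Date < s + width, , drop = FALSE]
  rownames(out) <- NULL  # renumber rows from 1 within each frame
  out
})
names(frames) <- paste0("df", seq_along(frames))

sapply(frames, nrow)
# df1 df2 df3 
#   6   6   6 
```

Because the start dates are derived from min(df$Date), the same code keeps working as new rows are appended; only n_windows or width would need adjusting.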
This produces a list of frames, not three independent frames. I think you'll find that when you have multiple identically-structured (column names/intents) frames, it's best to keep them in a list; that way, when you intend to do something to each of them, you can easily use lapply. See https://stackoverflow.com/a/24376207/3358227 for more discussion on this.
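To illustrate why the list form is convenient, here is a minimal sketch using a small stand-in list (the contents and the names frames, totals, combined and env are made up for the example). It shows one lapply call over every frame, a way to stack the frames with a label column, and, if separate objects are truly required, copying a named list into an environment with list2env.

```r
# Small stand-in for a named list of identically-structured frames
frames <- list(
  df1 = data.frame(Name = c("A", "B"), Value = 1:2),
  df2 = data.frame(Name = c("C", "D", "E"), Value = 3:5)
)

# Apply the same operation to every frame in one call
totals <- lapply(frames, function(x) sum(x$Value))

# Stack the frames into one, tagging each row with the frame it came from
combined <- do.call(
  rbind,
  Map(function(x, nm) cbind(window = nm, x), frames, names(frames))
)

# If independent objects are really needed, copy the list into an environment.
# A fresh environment is used here; passing globalenv() would create df1, df2
# in the workspace instead.
env <- new.env()
list2env(frames, envir = env)
ls(env)
```

Keeping the list as the primary structure and only exporting objects at the end tends to scale better when the number of windows changes from day to day.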
Upvotes: 1