Reputation: 93
I want to build a forecasting project from a time series data frame, but the time span is too large. The data frame has this date column:
Date
2010-06-29
2010-06-30
2010-07-01
2010-07-02
How can I change it so that it only shows every 7 days?
Date
2010-06-29
2010-07-05
2010-07-12
2010-07-19
etc
Upvotes: 0
Views: 326
Reputation: 9858
Daniel's answer is very simple and direct. However, it will return only data from a specified weekday, which could lead to biased results depending on the nature of your data.
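Instead, you can build an index in which each weekday appears once per week in random order, and keep only the rows whose weekday matches the index. This keeps about one-seventh of the rows on average, spread across all weekdays: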
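# load the packages first: ymd() comes from lubridate, %>% and filter() from dplyr
library(lubridate)
library(dplyr)
# example data
df <- data.frame(date = seq.Date(from = ymd("2021/01/01"),
                                 to = ymd("2021/12/31"),
                                 by = "day"))
# weekday name of each date (used to build the index and the final table)
df$weekday <- weekdays(df$date)
# create index by sampling weekdays randomly: one random permutation of the 7 weekdays per week
set.seed(1)
index <- replicate(floor(nrow(df)/7), {sample(unique(df$weekday), replace = FALSE)}) %>%
  as.vector()
# subsetting to a roughly 7-fold smaller dataset
# (index has 364 entries for 365 rows, so R recycles it for the last row and warns)
output <- df %>% filter(weekdays(date) == index)
# checking table of weekdays in the final dataset
table(output$weekday)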
Friday Monday Saturday Sunday Thursday Tuesday Wednesday
13 6 5 9 8 10 6
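If you need exactly one row per 7-day block instead of a random number of rows, one option is to sample a single row from each block. This is only a sketch in base R under my own assumptions: `week_id` and `picked` are names made up for illustration, and the blocks are consecutive runs of 7 rows, not calendar weeks.

```r
# example data: one row per day for 2021
df <- data.frame(date = seq(as.Date("2021-01-01"), as.Date("2021-12-31"), by = "day"))

# assign each row to a block of 7 consecutive rows (0, 0, ..., 1, 1, ...)
week_id <- (seq_len(nrow(df)) - 1) %/% 7

# pick one random row number from each block;
# sample.int() avoids the surprise of sample() on a length-1 vector
set.seed(1)
picked <- vapply(split(seq_len(nrow(df)), week_id),
                 function(rows) rows[sample.int(length(rows), 1)],
                 integer(1))

# keep only the picked rows
output <- df[unname(picked), , drop = FALSE]

# weekdays in the reduced dataset
table(weekdays(output$date))
```

With 365 rows this gives 53 blocks (the last one holds a single row), so the result has one row per block while the weekday still varies from block to block.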
Upvotes: 0
Reputation: 441
dataframe.new = dataframe[seq(1, nrow(dataframe), 7),]
seq documentation - https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/seq
Basically, seq(1, 100, 7) generates 1, 8, 15, ..., so the line above keeps every 7th row of the data frame.
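Note that taking every 7th row equals "every 7 days" only when the data has one row per calendar day. If some days are missing (for example weekends), you could select by date instead. A rough sketch, assuming the column is called `Date` and is of class Date; the example data and the names `targets` and `weekly` are made up for illustration:

```r
# hypothetical example data: weekdays only, so rows are not evenly spaced in time
dates <- seq(as.Date("2010-06-29"), as.Date("2010-08-31"), by = "day")
# format "%u" gives the ISO weekday number as text ("6" = Saturday, "7" = Sunday)
dataframe <- data.frame(Date = dates[!format(dates, "%u") %in% c("6", "7")])

# target dates: every 7 days, starting from the first date in the data
targets <- seq(min(dataframe$Date), max(dataframe$Date), by = "7 days")

# keep only the rows whose date is one of the targets
weekly <- dataframe[dataframe$Date %in% targets, , drop = FALSE]
```

A target date that does not exist in the data (a holiday, say) is simply skipped, so you may want to check `diff(weekly$Date)` afterwards.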
Upvotes: 1