Reputation: 2680
I have some data that looks like this:
# order_date quantity
# 1 2021-01-01 54
# 2 2021-01-01 32
# 3 2021-01-02 42
# 4 2021-01-01 132
# 5 2021-01-01 56
# 6 2021-01-02 88
# 7 2021-01-08 99
# 8 2021-01-10 54
When I use the following code:
df$week <- cut(as.Date(df$order_date), breaks="week")
I get the following:
# order_date quantity week
# 1 2021-01-01 54 2020-12-28
# 2 2021-01-01 32 2020-12-28
# 3 2021-01-02 42 2020-12-28
# 4 2021-01-01 132 2020-12-28
# 5 2021-01-01 56 2020-12-28
# 6 2021-01-02 88 2020-12-28
# 7 2021-01-08 99 2021-01-04
# 8 2021-01-10 54 2021-01-04
Since my data starts on 1/1/21, I would like the week grouping to start on 1/1/21 and not 12/28/2020 (the preceding Monday, which is where cut() starts weeks by default). So my groups would look like this:
# order_date quantity week
# 1 2021-01-01 54 2021-01-01
# 2 2021-01-01 32 2021-01-01
# 3 2021-01-02 42 2021-01-01
# 4 2021-01-01 132 2021-01-01
# 5 2021-01-01 56 2021-01-01
# 6 2021-01-02 88 2021-01-01
# 7 2021-01-08 99 2021-01-08
# 8 2021-01-10 54 2021-01-08
I'm open to other libraries or syntax.
Upvotes: 0
Views: 545
Reputation: 1253
An approach using my package timeplyr, which always uses the start date to build sequences unless specified otherwise. time_summarisev() internally uses findInterval().
# remotes::install_github("NicChr/timeplyr")
library(timeplyr)
dat$week <- time_summarisev(dat$order_date, by = "week",
unique = FALSE, sort = FALSE)
dat
#> order_date quantity week
#> 1 2021-01-01 54 2021-01-01
#> 2 2021-01-01 32 2021-01-01
#> 3 2021-01-01 42 2021-01-01
#> 4 2021-01-01 132 2021-01-01
#> 5 2021-01-01 56 2021-01-01
#> 6 2021-01-02 88 2021-01-01
#> 7 2021-01-03 99 2021-01-01
#> 8 2021-01-03 54 2021-01-01
#> 9 2021-01-02 23 2021-01-01
#> 10 2021-01-10 11 2021-01-08
Multi-unit week aggregations are also supported.
dat$week2 <- time_summarisev(dat$order_date, by = "2 weeks",
unique = FALSE, sort = FALSE)
# For comparison, lubridate::floor_date() does not support multi-week units:
dat$Week2 <- lubridate::floor_date(dat$order_date, "2 weeks", week_start = 5)
#> Error in validate_rounding_nunit(.Call(C_parse_unit, as.character(unit))): Rounding with week > 1 is not supported. Use aseconds for arbitrary units.
dat
#> order_date quantity week week2
#> 1 2021-01-01 54 2021-01-01 2021-01-01
#> 2 2021-01-01 32 2021-01-01 2021-01-01
#> 3 2021-01-01 42 2021-01-01 2021-01-01
#> 4 2021-01-01 132 2021-01-01 2021-01-01
#> 5 2021-01-01 56 2021-01-01 2021-01-01
#> 6 2021-01-02 88 2021-01-01 2021-01-01
#> 7 2021-01-03 99 2021-01-01 2021-01-01
#> 8 2021-01-03 54 2021-01-01 2021-01-01
#> 9 2021-01-02 23 2021-01-01 2021-01-01
#> 10 2021-01-10 11 2021-01-08 2021-01-01
Upvotes: 0
Reputation: 52399
You can manually set the first day of the week using lubridate::floor_date(). Here week_start = 5 means Friday, which is the weekday of 2021-01-01.
dat$week <- lubridate::floor_date(dat$order_date, "weeks", week_start = 5)
dat
# order_date quantity week
#1 2021-01-01 54 2021-01-01
#2 2021-01-01 32 2021-01-01
#3 2021-01-01 42 2021-01-01
#4 2021-01-01 132 2021-01-01
#5 2021-01-01 56 2021-01-01
#6 2021-01-02 88 2021-01-01
#7 2021-01-03 99 2021-01-01
#8 2021-01-03 54 2021-01-01
#9 2021-01-02 23 2021-01-01
#10 2021-01-10 11 2021-01-08
Data
order_date <- c("2021-01-01", "2021-01-01","2021-01-01","2021-01-01","2021-01-01","2021-01-02","2021-01-03","2021-01-03","2021-01-02","2021-01-10")
quantity <- c(54,32,42,132,56,88,99,54,23,11)
dat <- data.frame(order_date=as.Date(order_date), quantity)
Upvotes: 1
Reputation: 73802
You may use seq.Date on the date range plus one week to build the breaks for cut. No packages needed.
dat |>
transform(week=cut(order_date,
breaks=seq.Date(min(order_date), max(order_date) + 7,
by='week')))
# order_date quantity week
# 1 2021-01-01 54 2021-01-01
# 2 2021-01-01 32 2021-01-01
# 3 2021-01-01 42 2021-01-01
# 4 2021-01-01 132 2021-01-01
# 5 2021-01-01 56 2021-01-01
# 6 2021-01-02 88 2021-01-01
# 7 2021-01-03 99 2021-01-01
# 8 2021-01-03 54 2021-01-01
# 9 2021-01-08 23 2021-01-08
# 10 2021-01-10 11 2021-01-08
Note: the native pipe |> requires R >= 4.1.
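For context, here is a small base R sketch of why explicit breaks are needed. With breaks = "week", cut() aligns weeks to a calendar weekday (Monday by default, or Sunday with start.on.monday = FALSE), not to the first date in the data. The two dates below are illustrative and not the full data set.

```r
# Two illustrative dates: a Friday and the Friday one week later
d <- as.Date(c("2021-01-01", "2021-01-08"))

# Default: weeks are aligned to the Monday on or before each date
default_weeks <- cut(d, breaks = "week")
as.character(default_weeks)
# "2020-12-28" "2021-01-04"

# start.on.monday = FALSE aligns to Sunday instead
sunday_weeks <- cut(d, breaks = "week", start.on.monday = FALSE)
as.character(sunday_weeks)
# "2020-12-27" "2021-01-03"

# Explicit breaks anchored at the first date give the desired grouping
anchored_weeks <- cut(d, breaks = seq(min(d), max(d) + 7, by = "week"))
as.character(anchored_weeks)
# "2021-01-01" "2021-01-08"
```

Adding 7 days to the maximum ensures the last date still falls inside a closed interval of breaks.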
Data:
dat <- structure(list(order_date = structure(c(18628, 18628, 18628,
18628, 18628, 18629, 18630, 18630, 18635, 18637), class = "Date"),
quantity = c(54, 32, 42, 132, 56, 88, 99, 54, 23, 11)), class = "data.frame", row.names = c(NA,
-10L))
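Since cut() returns a factor, a possible alternative, if you want the week column to stay a Date, is plain date arithmetic. This is only a sketch: the small data frame below is made up for illustration and just reuses the column names from the question.

```r
# Illustrative data (not the full data set from the question)
dat2 <- data.frame(
  order_date = as.Date(c("2021-01-01", "2021-01-02", "2021-01-08", "2021-01-10")),
  quantity   = c(54, 42, 99, 54)
)

# Anchor the weeks at the earliest date in the data
start <- min(dat2$order_date)

# Whole days since the anchor, rounded down to a multiple of 7
days_since_start <- as.integer(dat2$order_date - start)
offset <- (days_since_start %/% 7L) * 7L

# Adding the offset back to the anchor yields the start of each 7-day block
dat2$week <- start + offset
format(dat2$week)
# "2021-01-01" "2021-01-01" "2021-01-08" "2021-01-08"
```

Changing 7L to 14L would give two-week blocks anchored at the same start date.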
Upvotes: 1