Reputation: 2253
Let's say I have a set of dates stored in df$diag
:
> df
diag
1 2017-01-02
2 2017-01-02
3 2017-01-05
4 2017-01-06
5 2017-01-09
6 2017-01-18
I want to make a new covariate df$week
that count consecutive weeks since monday 2017-01-02
as in yyyy-mm-dd
, where 2017-01-02
correspond to df$week==1
. For every monday that passes between 2017-01-02
and 2020-12-31
, the df$week
should increase by 1.
> df
diag week
1 2017-01-02 1
2 2017-01-02 1
3 2017-01-05 1
4 2017-01-06 1
5 2017-01-09 2
6 2017-01-18 3
I tried
mutate(df, week = 1 + round(as.numeric(difftime(strptime(diag, format = "%Y-%m-%d"),
strptime("02.01.2017", format = "%d.%m.%Y"),
units="weeks"), digits = 0)))
Which gives
diag week
1 2017-01-02 1
2 2017-01-02 1
3 2017-01-05 1
4 2017-01-06 2
5 2017-01-09 2
6 2017-01-18 3
However, friday the 2017-01-06
should correctly belong to df$week==1
, while monday the 2017-01-09
should proceed to df$week==2
.
I am looking for a solution in dplyr
.
Data
df <- structure(list(diag = c("2017-01-02", "2017-01-02", "2017-01-05",
"2017-01-06", "2017-01-09", "2017-01-18")), class = "data.frame", row.names = c(NA,
-6L))
Upvotes: 0
Views: 253
Reputation: 26238
How about this (assuming first day in diag is a monday)
df %>% mutate(diag = as.Date(diag),
weekno = as.numeric(diag - first(diag)),
weekno = (weekno %/% 7) +1)
diag weekno
1 2017-01-02 1
2 2017-01-02 1
3 2017-01-05 1
4 2017-01-06 1
5 2017-01-09 2
6 2017-01-18 3
Upvotes: 2
Reputation: 4497
Here is an approach using dplyr::mutate
with baseR
library(dplyr)
df %>%
mutate(
day_since_2017_01_02 = as.integer(difftime(diag, as.Date("2017-01-02"),
units = "days")),
week_index = (day_since_2017_01_02 + 1) %/% 7 + 1
)
Output
diag day_since_2017_01_02 week_index
1 2017-01-02 0 1
2 2017-01-02 0 1
3 2017-01-05 2 1
4 2017-01-06 3 1
5 2017-01-09 6 2
6 2017-01-18 15 3
Upvotes: 1