cmirian
cmirian

Reputation: 2253

How to calculate week number based on consecutive weeks spanning four years

Let's say I have a set of dates stored in df$diag:

> df
        diag
1 2017-01-02
2 2017-01-02
3 2017-01-05
4 2017-01-06
5 2017-01-09
6 2017-01-18

I want to make a new covariate df$week that count consecutive weeks since monday 2017-01-02 as in yyyy-mm-dd, where 2017-01-02 correspond to df$week==1. For every monday that passes between 2017-01-02 and 2020-12-31, the df$week should increase by 1.

> df
        diag  week
1 2017-01-02     1
2 2017-01-02     1
3 2017-01-05     1
4 2017-01-06     1
5 2017-01-09     2
6 2017-01-18     3

I tried

mutate(df, week = 1 + round(as.numeric(difftime(strptime(diag, format = "%Y-%m-%d"),
                                                strptime("02.01.2017", format = "%d.%m.%Y"),
                                                units="weeks"), digits = 0)))

Which gives

        diag week
1 2017-01-02    1
2 2017-01-02    1
3 2017-01-05    1
4 2017-01-06    2
5 2017-01-09    2
6 2017-01-18    3

However, friday the 2017-01-06 should correctly belong to df$week==1, while monday the 2017-01-09 should proceed to df$week==2.

I am looking for a solution in dplyr.

Data

df <- structure(list(diag = c("2017-01-02", "2017-01-02", "2017-01-05", 
"2017-01-06", "2017-01-09", "2017-01-18")), class = "data.frame", row.names = c(NA, 
-6L))

Upvotes: 0

Views: 253

Answers (2)

AnilGoyal
AnilGoyal

Reputation: 26238

How about this (assuming first day in diag is a monday)

df %>% mutate(diag = as.Date(diag),
              weekno = as.numeric(diag - first(diag)),
              weekno = (weekno %/% 7) +1)
        diag weekno
1 2017-01-02      1
2 2017-01-02      1
3 2017-01-05      1
4 2017-01-06      1
5 2017-01-09      2
6 2017-01-18      3

Upvotes: 2

Sinh Nguyen
Sinh Nguyen

Reputation: 4497

Here is an approach using dplyr::mutate with baseR

library(dplyr)
df %>%
  mutate(
    day_since_2017_01_02 = as.integer(difftime(diag, as.Date("2017-01-02"),
      units = "days")),
    week_index = (day_since_2017_01_02 + 1) %/% 7 + 1
  )

Output

        diag day_since_2017_01_02 week_index
1 2017-01-02                    0          1
2 2017-01-02                    0          1
3 2017-01-05                    2          1
4 2017-01-06                    3          1
5 2017-01-09                    6          2
6 2017-01-18                   15          3

Upvotes: 1

Related Questions