Corey Pembleton
Corey Pembleton

Reputation: 757

Add sequence of week count aligned to a date column with infrequent dates

I'm building a dataset and am looking to be able to add a week count to a dataset starting from the first date, ending on the last. I'm using it to summarize a much larger dataset, which I'd like summarized by week eventually.

Using this sample:

library(dplyr)

df <- tibble(Date = seq(as.Date("1944/06/1"), as.Date("1944/09/1"), "days"), 
Week = nrow/7)
# A tibble: 93 x 2
   Date        Week
   <date>     <dbl>
 1 1944-06-01 0.143
 2 1944-06-02 0.286
 3 1944-06-03 0.429
 4 1944-06-04 0.571
 5 1944-06-05 0.714
 6 1944-06-06 0.857
 7 1944-06-07 1    
 8 1944-06-08 1.14 
 9 1944-06-09 1.29 
10 1944-06-10 1.43 
# … with 83 more rows

Which definitely isn't right. Also, my real dataset isn't structured sequentially, there are many days missing between weeks so a straight sequential count won't work.

An ideal end result is an additional "week" column, based upon the actual dates (rather than hard-coded with a seq_along() type of result)

Upvotes: 0

Views: 277

Answers (2)

Paul
Paul

Reputation: 2959

Similar solution to Ronak's but with lubridate:

library(lubridate)

(df <- tibble(Date = seq(as.Date("1944/06/1"), as.Date("1944/09/1"), "days"),
              week = interval(min(Date), Date) %>%
                as.duration() %>%
                as.numeric("weeks") %>%
                floor() + 1))

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 388982

You could subtract all the Date values with the first Date and calculate the difference using difftime in "weeks", floor all the values and add 1 to start the counter from 1.

df$week <- floor(as.numeric(difftime(df$Date, df$Date[1], units = "weeks"))) + 1


df
# A tibble: 93 x 2
#   Date        week
#   <date>     <dbl>
# 1 1944-06-01     1
# 2 1944-06-02     1
# 3 1944-06-03     1
# 4 1944-06-04     1
# 5 1944-06-05     1
# 6 1944-06-06     1
# 7 1944-06-07     1
# 8 1944-06-08     2
# 9 1944-06-09     2
#10 1944-06-10     2
# … with 83 more rows

To use this in your dplyr pipe you could do

library(dplyr)
df %>%
 mutate(week = floor(as.numeric(difftime(Date, first(Date), units = "weeks"))) + 1)

data

df <- tibble::tibble(Date = seq(as.Date("1944/06/1"), as.Date("1944/09/1"), "days"))

Upvotes: 1

Related Questions