Nandu
Nandu

Reputation: 3126

R: How to get the Week number of the month

I am new in R.
I want the week number of the month, which the date belongs to.

By using the following code:

>CurrentDate<-Sys.Date()
>Week Number <- format(CurrentDate, format="%U")
>Week Number
"31"

%U will return the Week number of the year .
But i want the week number of the month.
If the date is 2014-08-01 then i want to get 1.( The Date belongs to the 1st week of the month).

Eg:
2014-09-04 -> 1 (The Date belongs to the 1st week of the month).
2014-09-10 -> 2 (The Date belongs to the 2nd week of the month).
and so on...

How can i get this?

Reference: http://astrostatistics.psu.edu/su07/R/html/base/html/strptime.html

Upvotes: 26

Views: 29968

Answers (10)

Henrique L.R.
Henrique L.R.

Reputation: 7

Simply do this:

library(lubridate)

ds1$Week <- week(ds1$Sale_Date)

This is high performance! It instantly works on my 12 milion rows dataset. On example above, ds1 is the dataset, Sale_Date is a date column (like "2015-11-23") The other approach, using "as.integer( format..." might work on small datasets, but on 12 million rows it would keep running forever...

Upvotes: -1

Georgery
Georgery

Reputation: 8117

I am late to the party and maybe noone is gonna read this answer...

Anyway, why not stay simple and do it like this:

library(lubridate)

x <- ymd(20200311, 20200308)

week(x) - week(floor_date(x, unit = "months")) + 1

[1] 3 2

Upvotes: 1

Tori Oblad
Tori Oblad

Reputation: 325

Issue Overview

It was difficult to tell which answers worked, so I built my own function nth_week and tested it against the others.

The issue that's leading to most of the answers being incorrect is this:

  • The first week of a month is often a short-week
  • Same with the last week of the month

For example, October 1st 2019 is a Tuesday, so 6 days into October (which is a Sunday) is already the second week. Also, contiguous months often share the same week in their respective counts, meaning that the last week of the prior month is commonly also the first week of the current month. So, we should expect a week count higher than 52 per year and some months that contain a span of 6 weeks.

Results Comparison

Here's a table showing examples where some of the above suggested algorithms go awry:

DATE            Tori user206 Scri Klev Stringi Grot Frei Vale epi iso coni
Fri-2016-01-01    1     1      1   1      5      1    1    1    1   1   1
Sat-2016-01-02    1     1      1   1      1      1    1    1    1   1   1
Sun-2016-01-03    2     1      1   1      1      2    2    1  -50   1   2
Mon-2016-01-04    2     1      1   1      2      2    2    1  -50 -51   2
----
Sat-2018-12-29    5     5      5   5      5      5    5    4    5   5   5
Sun-2018-12-30    6     5      5   5      5      6    6    4  -46   5   6
Mon-2018-12-31    6     5      5   5      6      6    6    4  -46 -46   6
Tue-2019-01-01    1     1      1   1      6      1    1    1    1   1   1

You can see that only Grothendieck, conighion, Freitas, and Tori are correct due to their treatment of partial week periods. I compared all days from year 100 to year 3000; there are no differences among those 4. (Stringi is probably correct for noting weekends as separate, incremented periods, but I didn't check to be sure; epiweek() and isoweek(), because of their intended uses, show some odd behavior near year-ends when using them for week incrementation.)

Speed Comparison

Below are the tests for efficiency between the implementations of: Tori, Grothendieck, Conighion, and Freitas

# prep
library(lubridate)
library(tictoc)

kepler<- ymd(15711227) # Kepler's birthday since it's a nice day and gives a long vector of dates
some_dates<- seq(kepler, today(), by='day')

# test speed of Tori algorithm
tic(msg = 'Tori')
Tori<- (5 + day(some_dates) + wday(floor_date(some_dates, 'month'))) %/% 7
toc()
Tori: 0.19 sec elapsed
# test speed of Grothendieck algorithm
wk <- function(x) as.numeric(format(x, "%U"))
tic(msg = 'Grothendieck')
Grothendieck<- (wk(some_dates) - wk(as.Date(cut(some_dates, "month"))) + 1)
toc()
Grothendieck: 1.99 sec elapsed
# test speed of conighion algorithm
tic(msg = 'conighion')
weeknum <- as.integer( format(some_dates, format="%U") )
mindatemonth <- as.Date( paste0(format(some_dates, "%Y-%m"), "-01") )
weeknummin <- as.integer( format(mindatemonth, format="%U") ) # the number of the week of the first week within the month
conighion <- weeknum - (weeknummin - 1) # this is as an integer
toc()
conighion: 2.42 sec elapsed
# test speed of Freitas algorithm
first_day_of_month_wday <- function(dx) {
   day(dx) <- 1
   wday(dx)
 }
tic(msg = 'Freitas')
Freitas<- ceiling((day(some_dates) + first_day_of_month_wday(some_dates) - 1) / 7)
toc()
Freitas: 0.97 sec elapsed



Fastest correct algorithm by about at least 5X

require(lubridate)

(5 + day(some_dates) + wday(floor_date(some_dates, 'month'))) %/% 7

# some_dates above is any vector of dates, like:
some_dates<- seq(ymd(20190101), today(), 'day')



Function Implementation

I also wrote a generalized function for it that performs either month or year week counts, begins on a day you choose (i.e. say you want to start your week on Monday), labels output for easy checking, and is still extremely fast thanks to lubridate.

nth_week<- function(dates = NULL,
                    count_weeks_in = c("month","year"),
                    begin_week_on = "Sunday"){

  require(lubridate)

  count_weeks_in<- tolower(count_weeks_in[1])

  # day_names and day_index are for beginning the week on a day other than Sunday
  # (this vector ordering matters, so careful about changing it)
  day_names<- c("Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Sunday")

  # index integer of first match
  day_index<- pmatch(tolower(begin_week_on),
                     tolower(day_names))[1]


  ### Calculate week index of each day

  if (!is.na(pmatch(count_weeks_in, "year"))) {

    # For year:
    # sum the day of year, index for day of week at start of year, and constant 5 
    #  then integer divide quantity by 7   
    # (explicit on package so lubridate and data.table don't fight)
    n_week<- (5 + 
                lubridate::yday(dates) + 
                lubridate::wday(floor_date(dates, 'year'), 
                                week_start = day_index)
    ) %/% 7

  } else {

    # For month:
    # same algorithm as above, but for month rather than year
    n_week<- (5 + 
                lubridate::day(dates) + 
                lubridate::wday(floor_date(dates, 'month'), 
                                week_start = day_index)
    ) %/% 7

  }

  # naming very helpful for review
  names(n_week)<- paste0(lubridate::wday(dates,T), '-', dates)

  n_week

}



Function Output

# Example raw vector output: 
some_dates<- seq(ymd(20190930), today(), by='day')
nth_week(some_dates)

Mon-2019-09-30 Tue-2019-10-01 Wed-2019-10-02 
             5              1              1 
Thu-2019-10-03 Fri-2019-10-04 Sat-2019-10-05 
             1              1              1 
Sun-2019-10-06 Mon-2019-10-07 Tue-2019-10-08 
             2              2              2 
Wed-2019-10-09 Thu-2019-10-10 Fri-2019-10-11 
             2              2              2 
Sat-2019-10-12 Sun-2019-10-13 
             2              3 
# Example tabled output:
library(tidyverse)

nth_week(some_dates) %>% 
  enframe('DATE','nth_week_default') %>% 
  cbind(some_year_day_options = as.vector(nth_week(some_dates, count_weeks_in = 'year', begin_week_on = 'Mon')))

             DATE nth_week_default some_year_day_options
1  Mon-2019-09-30                5                    40
2  Tue-2019-10-01                1                    40
3  Wed-2019-10-02                1                    40
4  Thu-2019-10-03                1                    40
5  Fri-2019-10-04                1                    40
6  Sat-2019-10-05                1                    40
7  Sun-2019-10-06                2                    40
8  Mon-2019-10-07                2                    41
9  Tue-2019-10-08                2                    41
10 Wed-2019-10-09                2                    41
11 Thu-2019-10-10                2                    41
12 Fri-2019-10-11                2                    41
13 Sat-2019-10-12                2                    41
14 Sun-2019-10-13                3                    41

Hope this work saves people the time of having to weed through all the responses to figure out which are correct.

Upvotes: 14

Arthur Vaz
Arthur Vaz

Reputation: 401

There is a simple way to do it with lubridate package:

isoweek() returns the week as it would appear in the ISO 8601 system, which uses a reoccurring leap week.

epiweek() is the US CDC version of epidemiological week. It follows same rules as isoweek() but starts on Sunday. In other parts of the world the convention is to start epidemiological weeks on Monday, which is the same as isoweek().

Reference here

Upvotes: 2

conighion
conighion

Reputation: 180

I don't know any build in functions but a work around would be

CurrentDate <- Sys.Date()
# The number of the week relative to the year
weeknum <- as.integer( format(CurrentDate, format="%U") )

# Find the minimum week of the month relative to the year
mindatemonth <- as.Date( paste0(format(CurrentDate, "%Y-%m"), "-01") )
weeknummin <- as.integer( format(mindatemonth, format="%U") ) # the number of the week of the first week within the month

# Calculate the number of the week relative to the month
weeknum <- weeknum - (weeknummin - 1) # this is as an integer

# With the following you can convert the integer to the same format of 
# format(CurrentDate, format="%U")
formatC(weeknum, width = 2, flag = "0")

Upvotes: 0

Wilson Freitas
Wilson Freitas

Reputation: 531

Using lubridate you can do

ceiling((day(date) + first_day_of_month_wday(date) - 1) / 7)

Where the function first_day_of_month_wday returns the weekday of the first day of month.

first_day_of_month_wday <- function(dx) {
  day(dx) <- 1
  wday(dx)
}

This adjustment must be done in order to get the correct week number otherwise if you have the 7th day of month on a Monday you will get 1 instead of 2, for example. This is only a shift in the day of month. The minus 1 is necessary because when the first day of month is sunday the adjustment is not needed, and the others weekdays follow this rule.

Upvotes: 5

Valentin_Ștefan
Valentin_Ștefan

Reputation: 6416

I came across the same issue and I solved it with mday from data.table package. Also, I realized that when using the ceiling() function, one also needs to account for the '5th week' situation. For example ceiling of the 30th day of a month ceiling(30/7) will give 5 ! Therefore, the ifelse statement below.

# Create a sample data table with days from year 0 until present
DT <- data.table(days = seq(as.Date("0-01-01"), Sys.Date(), "days"))
# compute the week of the month and account for the '5th week' case
DT[, week := ifelse( ceiling(mday(days)/7)==5, 4, ceiling(mday(days)/7) )]

> DT
              days week
     1: 0000-01-01    1
     2: 0000-01-02    1
     3: 0000-01-03    1
     4: 0000-01-04    1
     5: 0000-01-05    1
    ---                
736617: 2016-10-14    2
736618: 2016-10-15    3
736619: 2016-10-16    3
736620: 2016-10-17    3
736621: 2016-10-18    3

To have an idea about the speed, then run:

system.time( DT[, week := ifelse( ceiling(mday(days)/7)==5, 4, ceiling(mday(days)/7) )] )
   # user  system elapsed 
   # 3.23    0.05    3.27

It took approx. 3 seconds to compute the weeks for more than 700 000 days.

However, the ceiling way above will always create the last week longer than all the other weeks (the four weeks have 7,7,7, and 9 or 10 days). Another way would be to use something like

ceiling(1:31/31*4)
 [1] 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4

where you get 7, 8 , 8 and 8 days per respective week in a 31 days month.

DT[, week2 := ceiling(mday(days)/31*4)]

Upvotes: 4

Rich Scriven
Rich Scriven

Reputation: 99331

You can use day from the lubridate package. I'm not sure if there's a week-of-month type function in the package, but we can do the math.

library(lubridate)
curr <- Sys.Date()
# [1] "2014-08-08"
day(curr)               ## 8th day of the current month
# [1] 8
day(curr) / 7           ## Technically, it's the 1.14th week
# [1] 1.142857
ceiling(day(curr) / 7)  ## but ceiling() will take it up to the 2nd week.
# [1] 2

Upvotes: 12

Artem Klevtsov
Artem Klevtsov

Reputation: 9423

By analogy of the weekdays function:

monthweeks <- function(x) {
    UseMethod("monthweeks")
}
monthweeks.Date <- function(x) {
    ceiling(as.numeric(format(x, "%d")) / 7)
}
monthweeks.POSIXlt <- function(x) {
    ceiling(as.numeric(format(x, "%d")) / 7)
}
monthweeks.character <- function(x) {
    ceiling(as.numeric(format(as.Date(x), "%d")) / 7)
}
dates <- sample(seq(as.Date("2000-01-01"), as.Date("2015-01-01"), "days"), 7)
dates
#> [1] "2004-09-24" "2002-11-21" "2011-08-13" "2008-09-23" "2000-08-10" "2007-09-10" "2013-04-16"
monthweeks(dates)
#> [1] 4 3 2 4 2 2 3

Another solution to use stri_datetime_fields() from the stringi package:

stringi::stri_datetime_fields(dates)$WeekOfMonth
#> [1] 4 4 2 4 2 3 3

Upvotes: 16

perlice
perlice

Reputation: 87

I don't know R but if you take the week of the first day in the month you could use it to get the week in the month

2014-09-18
First day of month = 2014-09-01
Week of first day on month = 36
Week of 2014-09-18 = 38
Week in the month = 1 + (38 - 36) = 3

Upvotes: 7

Related Questions