Vivian

Reputation: 105

First day of the first week of a year when that day falls in the previous year

I have the following df:

time_series                  date   sales
store_0090_item_85261507     1/2020   1,0
store_0090_item_85261501     2/2020   0,0
store_0090_item_85261500     3/2020   6,0

Here, 'date' is in Week/Year format. I tried the following code:

df['date'] = df['date'].apply(lambda x: datetime.strptime(x + '/0', "%U/%Y/%w"))

But it returns this df:

time_series                  date        sales
store_0090_item_85261507     2020-01-05   1,0
store_0090_item_85261501     2020-01-12   0,0
store_0090_item_85261500     2020-01-19   6,0

But the first day of the first week of 2020 is 2019-12-29, considering Sunday as the first day of the week. How can I get 2019-12-29 as the first day of the first week of 2020, and not 2020-01-05?

Upvotes: 2

Views: 123

Answers (2)

Abhi_J

Reputation: 2129

From the datetime module's documentation:

%U: Week number of the year (Sunday as the first day of the week) as a zero padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0.
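To see what this rule means for the dates in the question, here is a small sketch using only the standard library. It formats a few days around New Year 2020 with %U; the chosen dates and the variable names are just illustrations.

```python
from datetime import date

# Format a few days around New Year 2020 with %U (Sunday-based week number).
days = [date(2019, 12, 29), date(2020, 1, 1), date(2020, 1, 4), date(2020, 1, 5)]
week_numbers = {d.isoformat(): d.strftime("%U") for d in days}

for day, week in week_numbers.items():
    print(day, "-> week", week)
```

The days from 2020-01-01 to 2020-01-04 come out as week 00, and week 01 only begins on Sunday 2020-01-05, which is why parsing "1/2020" with %U gives 2020-01-05.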

Edit: My original answer doesn't work for the input 1/2023, and using ISO 8601 date values doesn't work for 1/2021, so I've edited this answer to add a custom function.

Here is a way with a custom function:

import pandas as pd
from datetime import datetime, timedelta
##############################################
# to demonstrate issues with certain dates
print(datetime.strptime('0/2020/0', "%U/%Y/%w")) # 2019-12-29 00:00:00
print(datetime.strptime('1/2020/0', "%U/%Y/%w")) # 2020-01-05 00:00:00

print(datetime.strptime('0/2021/0', "%U/%Y/%w")) # 2020-12-27 00:00:00
print(datetime.strptime('1/2021/0', "%U/%Y/%w")) # 2021-01-03 00:00:00

print(datetime.strptime('0/2023/0', "%U/%Y/%w")) # 2023-01-01 00:00:00
print(datetime.strptime('1/2023/0', "%U/%Y/%w")) # 2023-01-01 00:00:00
#################################################

df = pd.DataFrame({'date':["1/2020", "2/2020", "3/2020", "1/2021", "2/2021", "1/2023", "2/2023"]})
print(df)

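def get_first_day(date):
    year = date.split('/')[1]
    # Sunday of week 0 and of week 1 for this year; the two coincide only
    # when January 1 is itself a Sunday (e.g. 2023).
    date0 = datetime.strptime('0/' + year + '/0', "%U/%Y/%w")
    date1 = datetime.strptime('1/' + year + '/0', "%U/%Y/%w")
    parsed = datetime.strptime(date + '/0', "%U/%Y/%w")
    # Otherwise %U week 1 starts one week too late, so shift back a week.
    return parsed if date0 == date1 else parsed - timedelta(weeks=1)

df['new_date'] = df['date'].apply(get_first_day)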
print(df)

Input

     date
0  1/2020
1  2/2020
2  3/2020
3  1/2021
4  2/2021
5  1/2023
6  2/2023

Output

     date   new_date
0  1/2020 2019-12-29
1  2/2020 2020-01-05
2  3/2020 2020-01-12
3  1/2021 2020-12-27
4  2/2021 2021-01-03
5  1/2023 2023-01-01
6  2/2023 2023-01-08
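The same dates can also be computed with plain date arithmetic instead of strptime. This is only a sketch, assuming (as the output above suggests) that week 1 is the Sunday-started week that contains 1 January. The helper name week_start_sunday is hypothetical and is not part of pandas or datetime.

```python
from datetime import date, timedelta

def week_start_sunday(week_year: str) -> date:
    """Return the Sunday that starts the given week, where week 1 contains 1 January.

    Hypothetical helper: expects a "week/year" string such as "1/2020".
    """
    week, year = (int(part) for part in week_year.split("/"))
    jan1 = date(year, 1, 1)
    # date.weekday() is Monday=0 ... Sunday=6, so this is the number of days
    # back to the most recent Sunday (0 if 1 January is itself a Sunday).
    days_since_sunday = (jan1.weekday() + 1) % 7
    first_sunday = jan1 - timedelta(days=days_since_sunday)
    return first_sunday + timedelta(weeks=week - 1)

for value in ["1/2020", "2/2020", "1/2021", "1/2023", "2/2023"]:
    print(value, week_start_sunday(value))
```

With a DataFrame like the one above, it could be applied as df['date'].apply(week_start_sunday).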

Upvotes: 1

FObersteiner

Reputation: 25544

You'll want to use ISO week parsing directives, for example:

import pandas as pd

date = pd.Series(["1/2020", "2/2020", "3/2020"])

pd.to_datetime(date+"/1", format="%V/%G/%u")

0   2019-12-30
1   2020-01-06
2   2020-01-13
dtype: datetime64[ns]

You can also shift by one day if the week should start on Sunday:

pd.to_datetime(date+"/1", format="%V/%G/%u") - pd.Timedelta('1d')
 
0   2019-12-29
1   2020-01-05
2   2020-01-12
dtype: datetime64[ns]
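One caveat worth checking before relying on the shift: ISO week 1 is the week that contains the first Thursday of the year, so in some years it does not contain 1 January. A small standard-library sketch with the same directives, for 2021 (the variable names are illustrative):

```python
from datetime import datetime, timedelta

# Monday of ISO week 1 of 2021, parsed with the same %V/%G/%u directives.
iso_monday = datetime.strptime("1/2021/1", "%V/%G/%u")
# Shift back one day to get the preceding Sunday.
shifted_sunday = iso_monday - timedelta(days=1)

print(iso_monday.date())      # 2021-01-04
print(shifted_sunday.date())  # 2021-01-03
```

Under ISO numbering, 1 to 3 January 2021 belong to week 53 of 2020, so the result differs from a convention where week 1 is the week containing 1 January (which would give 2020-12-27).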

Upvotes: 1
