Benjamin

Reputation: 3477

Pandas: get the monday date from columns containing year and week number

I have a dataframe like:

import pandas as pd

data = {'year': [2020, 2020, 2021, 2021], 'week': [52, 53, 1, 2]}
df = pd.DataFrame(data=data)

   year  week
0  2020    52
1  2020    53
2  2021     1
3  2021     2

and I would like to get, for each row, the date of the Monday of that week, something like:

   year  week  period
0  2020    52  2020-12-21
1  2020    53  2020-12-28
2  2021     1  2021-01-04
3  2021     2  2021-01-11

What is the correct way to do this in pandas?

Upvotes: 1

Views: 285

Answers (2)

Renaud

Reputation: 2819

You can try with:

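import pandas as pd

data = {'year': [2020, 2020, 2021, 2021], 'week': [52, 53, 1, 2]}
df = pd.DataFrame(data=data)

# Build strings such as '2020-52-1' (year, week number, weekday 1 = Monday)
df['date combined'] = df['year'].astype(str) + '-' + df['week'].astype(str) + '-1'
# %W numbers weeks from the first Monday of the year; %w is the weekday (0 = Sunday)
df['date of Monday'] = pd.to_datetime(df['date combined'], format='%Y-%W-%w')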

print(df)

result:

   year  week date combined date of Monday
0  2020    52     2020-52-1     2020-12-28
1  2020    53     2020-53-1     2021-01-04
2  2021     1      2021-1-1     2021-01-04
3  2021     2      2021-2-1     2021-01-11

Upvotes: 0

Benjamin

Reputation: 3477

The right way to do this is to use the ISO 8601 format directives (%G for the ISO year, %V for the ISO week, %u for the ISO weekday):

df["period"] = pd.to_datetime(
    df.year.astype(str) + '-W' + df.week.astype(str) + '-1',
    format='%G-W%V-%u')\
    .dt.strftime('%Y-%m-%d')

and not the non-ISO directives:

df["period"] = pd.to_datetime(
    df.year.astype(str) + '-W' + df.week.astype(str) + '-1',
    format='%Y-W%W-%w')\
    .dt.strftime('%Y-%m-%d')

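The two differ in how weeks are numbered. %W counts weeks from the first Monday of the calendar year (days before it fall in week 0), whereas %G, %V and %u follow ISO 8601, where week 1 is the week containing the first Thursday of the year. For 2020, week 52 therefore starts on 2020-12-21 under the ISO rules but on 2020-12-28 under %W (cf. comments).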
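As a cross-check outside pandas, here is a minimal sketch using only the standard library (Python 3.8+ for date.fromisocalendar). The helper names iso_monday and percent_w_monday are made up for illustration and are not part of any library. It compares the ISO rule with the %W rule on the same four rows:

```python
from datetime import date, datetime

# Same (year, week) pairs as in the question's dataframe.
rows = [(2020, 52), (2020, 53), (2021, 1), (2021, 2)]


def iso_monday(year, week):
    """Monday of the given ISO 8601 week (weekday 1 = Monday)."""
    return date.fromisocalendar(year, week, 1)


def percent_w_monday(year, week):
    """Monday according to %W, which counts weeks from the first Monday of the year."""
    return datetime.strptime(f"{year}-W{week}-1", "%Y-W%W-%w").date()


for year, week in rows:
    print(year, week, iso_monday(year, week), percent_w_monday(year, week))
```

The ISO column should match the output requested in the question, while the %W column is one week late for the two 2020 rows.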

Upvotes: 1
