DKM

Reputation: 1801

Pandas extract week of year and year from date

I'm stuck on this scenario and don't know how to solve it. I have a data frame to which I am adding "week_of_year" and "year" columns based on the "date" column, and that part works fine.

import pandas as pd
df = pd.DataFrame({'date': ['2018-12-31', '2019-01-01', '2019-12-31', '2020-01-01']})
df['date'] = pd.to_datetime(df['date'])
df['week_of_year'] = df['date'].apply(lambda x: x.weekofyear)
df['year'] = df['date'].apply(lambda x: x.year)
print(df)

Current Output

       date       week_of_year    year
0    2018-12-31      1            2018
1    2019-01-01      1            2019
2    2019-12-31      1            2019
3    2020-01-01      1            2020

Expected Output

For 2018 and 2019, the last day of the year falls in week 1 of the following year (2019 and 2020 respectively). I want the year column to reflect that: when the week is 1 but the date still belongs to the previous calendar year, the year should be the new year, as in the expected output below.

           date       week_of_year    year
    0    2018-12-31      1            2019
    1    2019-01-01      1            2019
    2    2019-12-31      1            2020
    3    2020-01-01      1            2020

Upvotes: 1

Views: 7318

Answers (2)

LeeRuns

Reputation: 514

TLDR CODE

To get the week number as a series

df['DATE'].dt.isocalendar().week

To store the week in a new column, call the same function and assign the returned series to that column:

df['WEEK'] = df['DATE'].dt.isocalendar().week

TLDR EXPLANATION

Use pd.Series.dt.isocalendar().week to get the ISO week number for a given datetime series.

Note:

  • column "DATE" must be stored as a datetime column
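As a rough sketch of how this applies to the question: isocalendar() returns a frame with year, week and day columns, so the ISO year can be taken from the same call. This assumes ISO 8601 week numbering is what the asker wants and a pandas version that has Series.dt.isocalendar() (1.1 or later). The column names WEEK and ISO_YEAR are only illustrative.

```python
import pandas as pd

# Sample dates from the question, stored as a datetime column
df = pd.DataFrame({
    'DATE': pd.to_datetime(['2018-12-31', '2019-01-01', '2019-12-31', '2020-01-01'])
})

# isocalendar() returns a DataFrame with 'year', 'week' and 'day' columns
iso = df['DATE'].dt.isocalendar()

df['WEEK'] = iso['week']      # ISO week number
df['ISO_YEAR'] = iso['year']  # ISO year the week belongs to, not the calendar year

print(df)
```

For these four dates the ISO year matches the expected output in the question (2019, 2019, 2020, 2020).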

Upvotes: 1

Georgina Skibinski

Reputation: 13387

Try:

df['date'] = pd.to_datetime(df['date'])
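# Series.dt.weekofyear was removed in pandas 2.0; isocalendar().week is its replacement
df['week_of_year'] = df['date'].dt.isocalendar().week
df['year'] = (df['date'] + pd.to_timedelta(6 - df['date'].dt.weekday, unit='D')).dt.year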

Outputs:

        date  week_of_year  year
0 2018-12-31             1  2019
1 2019-01-01             1  2019
2 2019-12-31             1  2020
3 2020-01-01             1  2020

A few things: generally avoid .apply(..).

For datetime columns you can work with the date components directly through the df[col].dt accessor.

Then, to get the last day of the week, add 6 - weekday days to the date, where weekday runs from 0 (Monday) to 6 (Sunday).
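To make the shift concrete, here is a minimal sketch that splits the one-liner into steps on the question's dates. The intermediate names (days_to_sunday, week_end, year_of_week_end) are only illustrative.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(['2018-12-31', '2019-01-01', '2019-12-31', '2020-01-01']))

# weekday is 0 for Monday ... 6 for Sunday, so this is the number of days left until Sunday
days_to_sunday = 6 - dates.dt.weekday

# Move each date forward to the Sunday that closes its week
week_end = dates + pd.to_timedelta(days_to_sunday, unit='D')

# Take the year of that Sunday rather than the year of the original date
year_of_week_end = week_end.dt.year

print(pd.DataFrame({'date': dates, 'week_end': week_end, 'year': year_of_week_end}))
```

Note that this takes the year of the week's Sunday, whereas ISO numbering assigns a week to the year containing its Thursday. The two agree for the dates in the question but can differ elsewhere: 2021-01-01, for example, is in ISO week 53 of 2020, while its week ends on 2021-01-03.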

Upvotes: 3
