Sokolokki
Sokolokki

Reputation: 863

Converting "year" and "week of year" columns to "date" in Pandas

Desired way

In order to convert two columns with year and week of year into date I would expect to do something like:

df['formatted_date'] = df.year*100+df.weekofyear
df['date'] = pd.to_datetime(df['formatted_date'], format='%Y%w')

However, it does not work, giving ValueError:

ValueError: unconverted data remains: 01

Workaround

The workaround, I have found is converting week of year into a day of year and working with year-dayofyear %Y%j format:

df['formatted_date'] = df.year*1000+df.weekofyear*7-6 
df['date'] = pd.to_datetime(df['formatted_date'], format='%Y%j')

The first line becomes ugly, but this works fine. Week of year is in the range (00,53). Any ideas, why is the elegant way not working?

Upvotes: 8

Views: 14546

Answers (1)

jezrael
jezrael

Reputation: 862511

You need combine %w for day of week - explanation with %W for week:

http://strftime.org/ for %W:

Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0.

And for %w:

Weekday as a decimal number, where 0 is Sunday and 6 is Saturday.


df = pd.DataFrame({'year':[2015, 2018],
                   'weekofyear':[10,12]})

dates = df.year*100+df.weekofyear
@adde
df['date'] = pd.to_datetime(dates.astype(str) + '0', format='%Y%W%w')
print (df)

   year  weekofyear  formatted_date       date
0  2015  10          201510         2015-03-15
1  2018  12          201812         2018-03-25

Another solution:

#added 0 only for demontration, you can remove it
df['formatted_date'] = df.year * 1000 + df.weekofyear * 10 + 0
df['date'] = pd.to_datetime(df['formatted_date'], format='%Y%W%w')
print (df)

   year  weekofyear  formatted_date       date
0  2015  10          2015100        2015-03-15
1  2018  12          2018120        2018-03-25

Upvotes: 11

Related Questions