Reputation: 1661
I have some data which is formatted by year and week. Here is a small sample:
week cases
0 192801 7647
1 192802 11427
2 192803 11970
3 192804 12920
4 192805 14660
The week data looks as if it is formatted like %Y%W, but when I try to convert it to datetime via pd.to_datetime(df.week, format='%Y%W'), I receive the first day of the first month of each year:
0 1928-01-01
1 1928-01-01
2 1928-01-01
3 1928-01-01
4 1928-01-01
What is going on? How can I format the week data correctly?
Upvotes: 6
Views: 5066
Reputation: 594
From this thread, it seems that just the week number is not sufficient to infer the date. You need the day of the week as well. (I'm really not sure why it doesn't just default to 0/Sunday)
So the following seems to do what you want:
pd.to_datetime(df.week.map(lambda x: str(x)+'-0'), format="%Y%W-%w")
We first ensure the data is a str, then attach a '-0' to the end (which means Sunday). Then grab the 'day of the week' using an additional %w in the format.
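For reference, here is a minimal self-contained sketch of the same idea, rebuilt from the sample in the question. The names as_text, week_end and week_start are only illustrative. One detail worth knowing: %W counts weeks starting on Monday, so '-0' (Sunday) lands on the last day of each week, while '-1' (Monday) lands on the first.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "week": [192801, 192802, 192803, 192804, 192805],
    "cases": [7647, 11427, 11970, 12920, 14660],
})

# The week column holds integers, so convert it to text before appending a weekday.
as_text = df["week"].astype(str)

# %W weeks start on Monday: "-0" (Sunday) is the last day of the week,
# "-1" (Monday) is the first.
week_end = pd.to_datetime(as_text + "-0", format="%Y%W-%w")
week_start = pd.to_datetime(as_text + "-1", format="%Y%W-%w")

print(week_start.iloc[0].date())  # 1928-01-02
print(week_end.iloc[0].date())    # 1928-01-08
```

Which weekday to append depends on whether you want each row labelled by the start or the end of its week.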
Note:
I found a note in the datetime docs that explains the behavior (Section 8.1.8 -> Notes -> Bullet-point #7):
When used with the strptime() method, %U and %W are only used in calculations when the day of the week and the calendar year (%Y) are specified.
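The quoted behavior can be seen with the standard library alone. This is just a small demonstration using week 5 of the sample data; the variable names are illustrative.

```python
from datetime import datetime

# Without a weekday, %W is parsed but not used in the calculation,
# so the result falls back to 1 January.
no_weekday = datetime.strptime("192805", "%Y%W")

# With a weekday (%w, where 1 = Monday), %W takes part in the calculation.
with_weekday = datetime.strptime("192805-1", "%Y%W-%w")

print(no_weekday.date())    # 1928-01-01
print(with_weekday.date())  # 1928-01-30
```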
Upvotes: 3
Reputation: 5225
After converting to datetime (which I believe requires a day in the date), you can format the dates using dt.strftime on the Series object like so:
pd.to_datetime(df.week.astype(str) + '-0', format='%Y%W-%w').dt.strftime('%Y-%W')
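A quick round-trip check, sketched with the standard library, shows why the weekday matters when formatting back to year and week. The helper year_week is hypothetical and exists only for this illustration.

```python
from datetime import datetime

def year_week(text, fmt):
    """Parse text with fmt, then format the result back as year-week."""
    return datetime.strptime(text, fmt).strftime("%Y-%W")

# No weekday: the date falls back to 1 January 1928, a Sunday before the
# first Monday of the year, so it formats as week 00.
lost = year_week("192805", "%Y%W")

# Weekday supplied: the date is the Sunday of week 5, so the week survives.
kept = year_week("192805-0", "%Y%W-%w")

print(lost)  # 1928-00
print(kept)  # 1928-05
```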
Upvotes: 0