Pandas parse week numbers

Question

Consider the following file test.csv:

"Time","RegionCode","RegionName","NumValue"
"2009-W40","AT","Austria",0
"2009-W40","BE","Belgium",54
"2009-W40","BG","Bulgaria",0
"2009-W40","CZ","Czech Republic",1

I'd like to parse the date stored in the first column and create a dataframe like so:

parser = lambda x: pd.datetime.strptime(x, "%Y-W%W")
df = pd.read_csv("test.csv", parse_dates=["Time"], date_parser=parser)

Result:

        Time RegionCode      RegionName  NumValue
0 2009-01-01         AT         Austria         0
1 2009-01-01         BE         Belgium        54
2 2009-01-01         BG        Bulgaria         0
3 2009-01-01         CZ  Czech Republic         1

However, the resulting time column is not correct. All I get is "2009-01-01", which is certainly not the 40th week of the year. Am I doing something wrong? Has anybody else had this issue when parsing weeks?

KenHBS · Accepted Answer

You are almost correct. The only problem is that a year and a week number alone do not determine a specific date, so the week number is ignored and the date falls back to January 1. The trick is to append a day of the week, for example 1 (Monday).
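To see the effect in isolation, here is a minimal sketch using only the standard library's datetime.strptime, which follows the same rule: %W is only used in the calculation when a weekday is also given. The variable names are just for illustration.

```python
from datetime import datetime

# Week number alone: %W is not used, so the result falls back to January 1.
week_only = datetime.strptime("2009-W40", "%Y-W%W")

# Week number plus weekday (%w: 1 = Monday): now the week is taken into account.
with_monday = datetime.strptime("2009-W40-1", "%Y-W%W-%w")

print(week_only.date())    # 2009-01-01
print(with_monday.date())  # 2009-10-05
```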

I would recommend using pd.to_datetime() with an explicit date-format string rather than a custom date_parser. With the added 1, that works out fine:

pd.to_datetime(df['Time'] + '-1', format='%Y-W%W-%w')
# 0   2009-10-05
# 1   2009-10-05
# 2   2009-10-05
# 3   2009-10-05
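Put together with the sample file, the whole thing might look like the sketch below. The rows are inlined through StringIO only to keep the example self-contained; with the real file you would pass "test.csv" to read_csv instead. The last part is an assumption on my side: labels like "2009-W40" often follow ISO 8601 week numbering, and if that is the case for your data, the ISO directives %G, %V and %u (available in recent pandas versions) give a different Monday than %W does.

```python
from io import StringIO

import pandas as pd

# Inline copy of the sample rows from test.csv so the sketch is self-contained.
csv_text = '''"Time","RegionCode","RegionName","NumValue"
"2009-W40","AT","Austria",0
"2009-W40","BE","Belgium",54
"2009-W40","BG","Bulgaria",0
"2009-W40","CZ","Czech Republic",1
'''

# Read "Time" as plain strings first; no parse_dates / date_parser needed.
df = pd.read_csv(StringIO(csv_text))

# Append "-1" (Monday) so the week number can be resolved to a date.
df["Time"] = pd.to_datetime(df["Time"] + "-1", format="%Y-W%W-%w")

# Assumption: if the labels are ISO 8601 weeks, use the ISO directives instead.
iso_monday = pd.to_datetime(pd.Series(["2009-W40"]) + "-1", format="%G-W%V-%u")

print(df["Time"].iloc[0].date())  # 2009-10-05
print(iso_monday.iloc[0].date())  # 2009-09-28
```

For 2009 the two conventions are one week apart, so it is worth checking which numbering the data source uses before picking a format.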
