Reputation: 874
I have the following DataFrame:
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 3, 2]}, index=['201701', '201702', '201703'])
where the index consists of strings representing dates in the format YYYYQQ (quarterly data).
When I try to convert the index to datetime, I get the error:
pd.to_datetime(df.index)
....
ValueError: month must be in 1...12
I suspect this is caused by the format that to_datetime infers for df.index, but I can't find a workaround. Any help?
Update: @Zero's answer also works, but this ended up being a solution too:
pd.to_datetime([x[:-2] + str(int(x[-2:])*3) for x in df.index], format = '%Y%m')
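A slightly more explicit variant of the one-liner above, as a minimal sketch: it zero-pads the month so that '%Y%m' is unambiguous, and lets you choose which month of the quarter to use. The helper name quarter_to_month_string and its month_in_quarter parameter are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 3, 2]},
                  index=['201701', '201702', '201703'])

def quarter_to_month_string(label, month_in_quarter=3):
    """Turn 'YYYYQQ' into 'YYYYMM' (hypothetical helper).

    month_in_quarter=3 picks the last month of the quarter,
    month_in_quarter=1 picks the first.
    """
    year, quarter = label[:4], int(label[4:])
    month = 3 * (quarter - 1) + month_in_quarter
    return f"{year}{month:02d}"

# Last month of each quarter, same months as the one-liner above.
last_months = pd.to_datetime(
    [quarter_to_month_string(x) for x in df.index], format='%Y%m')

# First month of each quarter.
first_months = pd.to_datetime(
    [quarter_to_month_string(x, month_in_quarter=1) for x in df.index],
    format='%Y%m')
```

Passing an explicit format means to_datetime does not have to guess what a six-digit string means.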
Upvotes: 1
Views: 4279
Reputation: 210842
I'd use Pandas Period:
In [92]: x = pd.PeriodIndex(df.index.astype(str).str.replace(r'0(\d)$', r'q\1', regex=True), freq='Q')
In [93]: x
Out[93]: PeriodIndex(['2017Q1', '2017Q2', '2017Q3'], dtype='period[Q-DEC]', freq='Q-DEC')
In [94]: x.to_timestamp()
Out[94]: DatetimeIndex(['2017-01-01', '2017-04-01', '2017-07-01'], dtype='datetime64[ns]', freq='QS-OCT')
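The same Period idea as a self-contained script, in case it helps. This is only a sketch: it builds the 'YYYYQn' labels with plain string slicing instead of a regex, and the variable names are my own.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 3, 2]},
                  index=['201701', '201702', '201703'])

# Build labels such as '2017Q1' from the 'YYYYQQ' strings.
labels = [f"{s[:4]}Q{int(s[4:])}" for s in df.index]
periods = pd.PeriodIndex(labels, freq='Q')

# Option 1: keep quarterly periods as the index.
df_q = df.copy()
df_q.index = periods

# Option 2: convert to timestamps at the start or end of each quarter.
quarter_starts = periods.to_timestamp()  # how='start' is the default
quarter_ends = periods.to_timestamp(how='end').normalize()  # drop the time part
```

Keeping the PeriodIndex can be the better fit when the data really describes whole quarters rather than single days.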
Upvotes: 1
Reputation: 76917
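Use pd.offsets.QuarterBegin. With its default startingMonth=3, the results fall on the first of March, June and September rather than on the calendar quarter starts: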
In [2325]: [pd.to_datetime(x[:4]) + pd.offsets.QuarterBegin(int(x[5:])) for x in df.index]
Out[2325]:
[Timestamp('2017-03-01 00:00:00'),
Timestamp('2017-06-01 00:00:00'),
Timestamp('2017-09-01 00:00:00')]
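If you want calendar quarter starts (January, April, July, October) instead, one possible sketch swaps QuarterBegin for a plain month offset with pd.DateOffset. The helper name quarter_start is made up for this example, and it assumes every label is a four-digit year followed by a two-digit quarter.

```python
import pandas as pd

index = ['201701', '201702', '201703']

def quarter_start(label):
    """Return the first day of the quarter for a 'YYYYQQ' label (hypothetical helper)."""
    year, quarter = int(label[:4]), int(label[4:])
    # Start from January 1st and move forward three months per elapsed quarter.
    return pd.Timestamp(year=year, month=1, day=1) + pd.DateOffset(months=3 * (quarter - 1))

starts = pd.DatetimeIndex([quarter_start(x) for x in index])
```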
Upvotes: 2