Reputation: 403
Being new to Python and pandas, I ran into the following problem. My dataframe has a column of dates (yyyy-mm-ddThh-mm-sec). Most of the years are fine (e.g. 2008), but in some rows the year is written as 0008. Because of this, I can't convert the column with pd.to_datetime.
My idea was to convert it to a 2-digit year first (using pd.to_datetime(df['date']).dt.strftime('%y %b, %d %H:%M:%S.%f +%Z')), but I got the error Out of bounds nanosecond timestamp: 08-10-02 14:41:00.
Are there any other ways to convert 0008 to 2008 in a dataframe?
Thanks for the help in advance
Upvotes: 2
Views: 256
Reputation: 14113
If the bad data always has the same format (that is, the year is always 4 characters), you can slice off the first two characters with .str and let pandas parse the remaining 2-digit year:
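import pandas as pd
df = pd.DataFrame({'date': ['2008-01-01', '0008-01-02']})
# Drop the first two characters so every year is 2-digit, then parse year-first
df['date'] = pd.to_datetime(df['date'].str[2:], yearfirst=True)
        date
0 2008-01-01
1 2008-01-02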
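For context, the original error occurs because year 0008 is outside the range a nanosecond-resolution pandas Timestamp can represent (roughly 1677 to 2262). If you would rather keep the 4-digit year and only touch the bad rows, one variation is to rewrite a leading "00" to "20" before parsing. This is only a sketch: the sample strings and the format '%Y-%m-%dT%H:%M:%S' are assumptions based on the error message in the question, so adjust them to your real data.

```python
import pandas as pd

# Sample data is an assumption: ISO-like strings, one with a "00xx" year
df = pd.DataFrame({"date": ["2008-10-02T14:41:00", "0008-10-02T14:41:00"]})

# Rewrite only years that start with "00" (0008 -> 2008); other rows are untouched
fixed = df["date"].str.replace(r"^00(\d{2})-", r"20\1-", regex=True)

# Parse with an explicit format (assumed) so nothing is left to inference
df["date"] = pd.to_datetime(fixed, format="%Y-%m-%dT%H:%M:%S")
```

Unlike slicing with .str[2:], this leaves correct years alone, so it does not depend on how pandas expands 2-digit years.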
Upvotes: 5