Reputation: 437
I have to parse a date column that contains dates in mixed formats:
0 1972-12-31
1 1980-03-31
2 1980-03-31
3 1973-08-31
4 1985-06-28
...
44215 2017 Nov 17
44216 2009-02-13
44217 2018 Jul 3
44218 2011-03-15
44219 2017 Nov 8
Name: publish_time, Length: 44220, dtype: object
I tried to parse it with pandas:
pd.datetime.strptime(metadata['publish_time'], '%Y-%m-%d')
But it gives me a deprecation warning followed by this error:
/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:1: FutureWarning: The pandas.datetime class is deprecated and will be removed from pandas in a future version. Import from datetime instead.
"""Entry point for launching an IPython kernel.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-83-fa9f0e16e2d9> in <module>()
----> 1 pd.datetime.strptime(metadata['publish_time'], '%Y-%m-%d')
TypeError: strptime() argument 1 must be str, not Series
Any idea how to solve this problem?
Upvotes: 0
Views: 664
Reputation: 34056
pd.to_datetime is pretty smart when it comes to identifying different date formats, and unlike strptime it accepts a whole Series.
Something like this would work:
In [153]: df = pd.DataFrame({'date': ['1973-08-31','2017 Nov 17', '2009-02-13','2018 Jul 3']})
In [154]: df
Out[154]:
          date
0   1973-08-31
1  2017 Nov 17
2   2009-02-13
3   2018 Jul 3
In [155]: df['date'] = pd.to_datetime(df['date'])
In [156]: df
Out[156]:
        date
0 1973-08-31
1 2017-11-17
2 2009-02-13
3 2018-07-03
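One caveat: from pandas 2.0 on, pd.to_datetime infers a single format from the first value, so a mixed column like this one may raise unless you pass format="mixed". A version-independent alternative is to parse the column once per known format and combine the results. The following is only a sketch under assumptions: the two formats are guessed from the sample rows in the question, the variable names and the "not a date" entry are made up for illustration, and %b expects English month abbreviations.

```python
import pandas as pd

# Illustrative sample taken from the question, plus one bad value (hypothetical).
publish_time = pd.Series(
    ["1973-08-31", "2017 Nov 17", "2009-02-13", "2018 Jul 3", "not a date"]
)

# Parse with each assumed format; values that do not match become NaT.
iso_dates = pd.to_datetime(publish_time, format="%Y-%m-%d", errors="coerce")
named_month_dates = pd.to_datetime(publish_time, format="%Y %b %d", errors="coerce")

# Keep the ISO result where it parsed, otherwise fall back to the second format.
parsed = iso_dates.fillna(named_month_dates)

# Anything still NaT matched neither format and may need manual inspection.
unparsed = publish_time[parsed.isna()]

print(parsed)
print(unparsed)
```

Listing the formats explicitly is more verbose than letting pandas guess, but it avoids ambiguous day/month guesses and makes it easy to see which rows matched nothing.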
Upvotes: 3