9879ypxkj

Reputation: 437

Python: How to parse dates in mixed format?

I have to parse a date column that is in mixed format:

0         1972-12-31
1         1980-03-31
2         1980-03-31
3         1973-08-31
4         1985-06-28
            ...     
44215    2017 Nov 17
44216     2009-02-13
44217     2018 Jul 3
44218     2011-03-15
44219     2017 Nov 8
Name: publish_time, Length: 44220, dtype: object

I tried to parse it with pandas:

pd.datetime.strptime(metadata['publish_time'], '%Y-%m-%d')

But it gives me this error:

/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:1: FutureWarning: The pandas.datetime class is deprecated and will be removed from pandas in a future version. Import from datetime instead.
  """Entry point for launching an IPython kernel.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-83-fa9f0e16e2d9> in <module>()
----> 1 pd.datetime.strptime(metadata['publish_time'], '%Y-%m-%d')

TypeError: strptime() argument 1 must be str, not Series

Any idea how to solve this problem?

Upvotes: 0

Views: 664

Answers (1)

Mayank Porwal

Reputation: 34056

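strptime only accepts a single string, which is why passing a Series raises the TypeError. pd.to_datetime works on the whole column and recognizes many common date formats without an explicit format string. Note that from pandas 2.0 onwards the format is inferred from the first value, so a column with mixed formats also needs format='mixed'.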

Something like this would work:

In [153]: df = pd.DataFrame({'date': ['1973-08-31', '2017 Nov 17', '2009-02-13', '2018 Jul 3']})

In [154]: df
Out[154]:
          date
0   1973-08-31
1  2017 Nov 17
2   2009-02-13
3   2018 Jul 3

In [155]: df['date'] = pd.to_datetime(df['date'])

In [156]: df
Out[156]: 
        date
0 1973-08-31
1 2017-11-17
2 2009-02-13
3 2018-07-03
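
If you would rather not rely on format inference (its behaviour differs between pandas versions), here is a minimal sketch that tries an explicit list of formats per value. It assumes the column only contains the two layouts shown in the question and that the month abbreviations are English (%b is locale-dependent). KNOWN_FORMATS and parse_mixed are names made up for this example, not part of pandas.

```python
from datetime import datetime

import pandas as pd

# Assumption: these are the only two layouts in the column.
KNOWN_FORMATS = ("%Y-%m-%d", "%Y %b %d")


def parse_mixed(value):
    """Try each known format in turn; return NaT if none matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except (TypeError, ValueError):
            # Wrong layout (ValueError) or non-string input (TypeError).
            continue
    return pd.NaT


df = pd.DataFrame({"date": ["1973-08-31", "2017 Nov 17", "2009-02-13", "2018 Jul 3"]})

# apply() calls parse_mixed once per value; to_datetime then
# normalizes the result to a datetime64 column.
df["date"] = pd.to_datetime(df["date"].apply(parse_mixed))
print(df)
```

This is slower than a vectorized pd.to_datetime call on a large column, but unparseable values come out as NaT instead of raising, so you can inspect them afterwards with df[df['date'].isna()].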

Upvotes: 3
