wololo

Reputation: 345

Parsing two different formats of dates in data frame

I have a column containing dates in two different formats, which I'm trying to convert to datetime using pandas' to_datetime. Here's the code:

import pandas as pa
pa.to_datetime(data["servertime"], format="%a %b %d %H:%M:%S %Y")

For example, a typical servertime value is Tue Nov 4 12:01:15 2014

But a few rows have data in the following format, u'2014-11-04 13:15:13 +0000', which raises an error.

How do I parse two different formats present in the same column?

If I can't, then how do I convert, remove, or flag those rows (preferably without hard-coding the condition)?

Upvotes: 1

Views: 327

Answers (1)

MattDMo

Reputation: 102842

Instead of using to_datetime(), first parse your strings with dateutil.parser.parse():

In [2]: from dateutil.parser import parse

In [3]: dt1 = "Tue Nov 4 12:01:15 2014"

In [4]: dt2 = "2014-11-04 13:15:13 +0000"

In [5]: parse(dt1)
Out[5]: datetime.datetime(2014, 11, 4, 12, 1, 15)

In [6]: parse(dt2)
Out[6]: datetime.datetime(2014, 11, 4, 13, 15, 13, tzinfo=tzutc())

You can then feed the datetime.datetime values into your dataframe.
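
As a rough sketch of that last step, the parser can be applied to the whole column. The sample frame, the helper name parse_or_nat, and the new column servertime_parsed are made up for illustration. Treating timestamps without an offset as UTC is an assumption; adjust it if your server logs in another zone. Strings that dateutil cannot parse become NaT, which flags them without hard-coding either format.

```python
from datetime import timezone

import pandas as pd
from dateutil.parser import parse

# Hypothetical sample data: the two formats from the question plus one bad value.
data = pd.DataFrame({
    "servertime": [
        "Tue Nov 4 12:01:15 2014",
        "2014-11-04 13:15:13 +0000",
        "not a date",
    ]
})


def parse_or_nat(text):
    """Parse one string with dateutil; return NaT if it cannot be parsed."""
    try:
        dt = parse(text)
    except (ValueError, OverflowError):
        return pd.NaT  # flag the row instead of raising
    if dt.tzinfo is None:
        # Assumption: values without an offset are already in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


# Build a timezone-aware datetime column from the parsed values.
data["servertime_parsed"] = pd.to_datetime(
    data["servertime"].map(parse_or_nat), utc=True
)

# Rows that failed to parse can be inspected or dropped afterwards.
bad_rows = data[data["servertime_parsed"].isna()]
```

Parsing row by row with dateutil is slower than to_datetime with a fixed format, so on a large column it may be worth trying the fixed format first and falling back to this only for the rows that fail.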

Upvotes: 1
