MasterScrat

Reputation: 7366

Parsing dates with Pandas: how to take time zones into account?

I have dates in these formats:

Thursday, September 22, 2016 at 11:04am UTC+02
Monday, January 22, 2018 at 6:46pm CST
...

I want to convert them to UNIX timestamps. This pattern works, but it ignores the timezone:

timestamp = pd.to_datetime(date, format='%A, %B %d, %Y at %I:%M%p', exact=False)

I don't see how to take the time zones ("UTC+02", "CST") into account.

This doesn't work:

timestamp = pd.to_datetime(date, format='%A, %B %d, %Y at %I:%M%p %Z')
# ValueError: unconverted data remains: +02

Upvotes: 1

Views: 1156

Answers (2)

chthonicdaemon

Reputation: 19770

I know you asked for a Pandas solution, but dateutil handles your strings correctly:

import dateutil.parser
from dateutil.tz import gettz

samples = ['Thursday, September 22, 2016 at 11:04am UTC+02',
           'Monday, January 22, 2018 at 6:46pm CST']

# American time zone abbreviations
tzinfos = {'HAST': gettz('Pacific/Honolulu'),
           'AKST': gettz('America/Anchorage'),
           'PST': gettz('America/Los_Angeles'),
           'MST': gettz('America/Phoenix'),
           'CST': gettz('America/Chicago'),
           'EST': gettz('America/New_York'),
           }

for s in samples:
    parsed = dateutil.parser.parse(s, fuzzy=True, tzinfos=tzinfos)
    print(s, '->', parsed)

Output:

Thursday, September 22, 2016 at 11:04am UTC+02 -> 2016-09-22 11:04:00-02:00
Monday, January 22, 2018 at 6:46pm CST -> 2018-01-22 18:46:00-06:00
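
One caveat about the first line of output: dateutil reads a suffix like UTC+02 in the POSIX sense ("my time plus 2 hours is UTC"), which is why the result carries -02:00. If your strings mean "two hours ahead of UTC" (an assumption about where the data comes from), one option is to rewrite the suffix into a plain numeric offset before parsing. The helper below is a minimal sketch of that preprocessing step; normalize_utc_suffix is a hypothetical name, not part of dateutil or Pandas.

```python
import re

# Matches a trailing "UTC+02", "GMT-5" or "UTC+05:30" style suffix.
_UTC_SUFFIX = re.compile(r'\b(?:UTC|GMT)([+-])(\d{1,2})(?::?(\d{2}))?$')


def normalize_utc_suffix(text):
    """Rewrite a trailing 'UTC+02' as '+0200'; leave other strings unchanged."""
    def repl(match):
        sign = match.group(1)
        hours = int(match.group(2))
        minutes = match.group(3) or '00'
        return f'{sign}{hours:02d}{minutes}'

    return _UTC_SUFFIX.sub(repl, text.strip())


for s in ['Thursday, September 22, 2016 at 11:04am UTC+02',
          'Monday, January 22, 2018 at 6:46pm CST']:
    print(normalize_utc_suffix(s))
```

The rewritten string can then be passed to dateutil.parser.parse as before; a bare +0200 should be read as a numeric offset ahead of UTC rather than as a POSIX-style zone name. Check the result against a date whose true UTC time you know.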

Upvotes: 1

Hasan Jawad

Reputation: 317

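You get "ValueError: unconverted data remains: +02" because strptime has to consume the whole date string, and your format leaves the offset part (which %z would cover) unparsed. But you can't use %z in strptime on Python 2, see ISO to datetime object: 'z' is a bad directive. On Python 3, where %z is supported, it expects an offset that includes minutes, such as +0200, so a bare +02 still would not match.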

So maybe you could do some sort of mapping on your data:

import dateutil.parser
timestamp = date.map(lambda x: dateutil.parser.parse(x))
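
If you would rather not depend on dateutil, a mapping function can also be sketched with the standard library alone and return UNIX timestamps directly, which is what the question asks for. This is only a sketch under some assumptions: UTC+02 is taken to mean two hours ahead of UTC, TZ_OFFSETS is a hypothetical lookup table of fixed offsets that you would extend for your own data (it ignores daylight-saving abbreviations such as CDT), and to_unix is an illustrative name.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical lookup table: fixed UTC offsets in hours for the
# abbreviations expected in the data. Extend it to match your strings.
TZ_OFFSETS = {'UTC': 0, 'EST': -5, 'CST': -6, 'MST': -7, 'PST': -8}

# Splits "<date text> <ABBR>[+/-HH]" into the date text, the abbreviation
# and an optional whole-hour shift.
PATTERN = re.compile(
    r'^(?P<stamp>.+?)\s+(?P<abbr>[A-Z]+)(?P<shift>[+-]\d{1,2})?$')


def to_unix(text):
    """Parse one date string and return seconds since the UNIX epoch."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f'unrecognised date string: {text!r}')
    # %I (12-hour clock) is required for %p to affect the parsed hour.
    naive = datetime.strptime(match['stamp'], '%A, %B %d, %Y at %I:%M%p')
    # Assumption: "UTC+02" means two hours ahead of UTC.
    hours = TZ_OFFSETS[match['abbr']] + int(match['shift'] or 0)
    aware = naive.replace(tzinfo=timezone(timedelta(hours=hours)))
    return int(aware.timestamp())


samples = ['Thursday, September 22, 2016 at 11:04am UTC+02',
           'Monday, January 22, 2018 at 6:46pm CST']
for s in samples:
    print(s, '->', to_unix(s))
```

If date is a Pandas Series of such strings, date.map(to_unix) applies the same function element by element.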

Upvotes: 0
