Reputation: 2445
I have a pandas dataframe with a column of timestamps and a column giving the timezone each timestamp is in. What's the best way to convert all these timestamps to UTC?
Sample data in csv:
0,2000-01-28 16:47:00,America/Chicago
1,2000-01-29 16:48:00,America/Chicago
2,2000-01-30 16:49:00,America/Los_Angeles
3,2000-01-31 16:50:00,America/Chicago
4,2000-01-01 16:50:00,America/New_York
Upvotes: 3
Views: 1517
Reputation: 129038
This can be done efficiently by converting one timezone at a time; since there are several, groupby already separates them out. The timestamps are local times (in other words, expressed in the given timezone), so tz_localize makes them tz-aware. When the groups are combined again, the values are automatically converted to UTC.
Note: this is on master/0.17.0, releasing soon. A solution for < 0.17.0 is below.
In [19]: df = pd.read_csv(StringIO(data), header=None, names=['value', 'date', 'tz'])
In [20]: df.dtypes
Out[20]:
value     int64
date     object
tz       object
dtype: object
In [21]: df
Out[21]:
   value                 date                   tz
0      0  2000-01-28 16:47:00      America/Chicago
1      1  2000-01-29 16:48:00      America/Chicago
2      2  2000-01-30 16:49:00  America/Los_Angeles
3      3  2000-01-31 16:50:00      America/Chicago
4      4  2000-01-01 16:50:00     America/New_York
In [22]: df['utc'] = df.groupby('tz').date.apply(
   ....:     lambda x: pd.to_datetime(x).dt.tz_localize(x.name))
In [23]: df
Out[23]:
   value                 date                   tz                  utc
0      0  2000-01-28 16:47:00      America/Chicago  2000-01-28 22:47:00
1      1  2000-01-29 16:48:00      America/Chicago  2000-01-29 22:48:00
2      2  2000-01-30 16:49:00  America/Los_Angeles  2000-01-31 00:49:00
3      3  2000-01-31 16:50:00      America/Chicago  2000-01-31 22:50:00
4      4  2000-01-01 16:50:00     America/New_York  2000-01-01 21:50:00
In [24]: df.dtypes
Out[24]:
value             int64
date             object
tz               object
utc      datetime64[ns]
dtype: object
In < 0.17.0, you also need to run:
df['utc'] = df['utc'].dt.tz_localize(None)
to convert to UTC.
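The same per-timezone idea can also be written with an explicit tz_convert, so the conversion to UTC does not depend on what happens when the groups are combined. This is only a minimal sketch, assuming the sample data from the question; the helper name to_utc is hypothetical and not part of pandas.

```python
from io import StringIO

import pandas as pd

data = """0,2000-01-28 16:47:00,America/Chicago
1,2000-01-29 16:48:00,America/Chicago
2,2000-01-30 16:49:00,America/Los_Angeles
3,2000-01-31 16:50:00,America/Chicago
4,2000-01-01 16:50:00,America/New_York"""

df = pd.read_csv(StringIO(data), header=None, names=["value", "date", "tz"])


def to_utc(frame):
    """Hypothetical helper: localize each timezone group, then convert it to UTC."""
    naive = pd.to_datetime(frame["date"])
    pieces = []
    # One vectorized localize/convert per distinct timezone name.
    for tz_name, group in naive.groupby(frame["tz"]):
        pieces.append(group.dt.tz_localize(tz_name).dt.tz_convert("UTC"))
    # Every piece is already in UTC, so concatenating keeps a single tz-aware dtype;
    # reindex restores the original row order.
    return pd.concat(pieces).reindex(frame.index)


df["utc"] = to_utc(df)
print(df)
```

Unlike the session above, the resulting column here stays tz-aware (UTC); calling .dt.tz_localize(None) on it would drop the timezone and leave naive UTC values.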
Upvotes: 3
Reputation: 636
In general: combine the two CSV time columns during the import (or before). This can be done with a small lambda function.
To convert (parse) that combined info, several options exist. Most are described here or in the pandas docs. Personally I like the utils.parse one.
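One way to read the "small lambda function" suggestion is a row-wise combine of the two columns. This is a rough sketch under that assumption: it combines the columns after the import rather than during it, it uses pd.Timestamp rather than the parser mentioned above, and the column names are the ones assumed for the question's sample data.

```python
from io import StringIO

import pandas as pd

data = """0,2000-01-28 16:47:00,America/Chicago
1,2000-01-29 16:48:00,America/Chicago
2,2000-01-30 16:49:00,America/Los_Angeles
3,2000-01-31 16:50:00,America/Chicago
4,2000-01-01 16:50:00,America/New_York"""

df = pd.read_csv(StringIO(data), header=None, names=["value", "date", "tz"])

# Build a tz-aware Timestamp per row from the local time and its timezone,
# then convert that Timestamp to UTC.
combined = df.apply(
    lambda row: pd.Timestamp(row["date"], tz=row["tz"]).tz_convert("UTC"),
    axis=1,
)

# utc=True normalizes the result to a single tz-aware (UTC) datetime column.
df["utc"] = pd.to_datetime(combined, utc=True)
print(df)
```

A row-wise apply is simple to read but runs Python code once per row, so on large frames it will likely be slower than processing one timezone group at a time.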
Upvotes: 1