Reputation: 405
A data frame with a column dateMillis that contains timestamp as milliseconds from epoch
In [5]: df.head(5)
Out[5]:
custId dateMillis
0 31403 1488232800000
1 28197 1488232800013
2 30158 1488232800026
3 28538 1488232800030
4 28538 1488232800033
If I convert the value of dateMillis in the first row to datetime using python's datetime I get the following
In [8]: print datetime.datetime.fromtimestamp(df.ix[0]['dateMillis']/1000.0)
2017-02-27 17:00:00
which is correct. But if I convert the column dateMillis using panda's to_datetime I get
In [11]: df['date'] = pd.to_datetime(df['dateMillis'], unit='ms')
In [12]: df.head(5)
Out[12]:
custId dateMillis date
0 31403 1488232800000 2017-02-27 22:00:00.000
1 28197 1488232800013 2017-02-27 22:00:00.013
2 30158 1488232800026 2017-02-27 22:00:00.026
3 28538 1488232800030 2017-02-27 22:00:00.030
4 28538 1488232800033 2017-02-27 22:00:00.033
which is five hours ahead. The machine itself has EST time zone, so it seems like pandas converts datetime to the local time zone.
In [15]: time.tzname
Out[15]: ('EST', 'EDT')
I would like to_datetime to be consistent with datetime.fromtimestamp. I tried various values (True/False/None) for utc argument, but it didn't help.
Upvotes: 2
Views: 3760
Reputation: 210882
UNIX timestamp (epoch) per default doesn't know anything of any time zone.
"This count starts at the Unix Epoch on January 1st, 1970 at UTC`.
But you can use the following workaround:
In [47]: datetime.datetime.fromtimestamp(df.loc[0, 'dateMillis']/1000)
Out[47]: datetime.datetime(2017, 2, 27, 23, 0) # 23:00:00
In UTC TZ:
In [48]: pd.to_datetime(df['dateMillis'], unit='ms')
Out[48]:
0 2017-02-27 22:00:00.000
1 2017-02-27 22:00:00.013
2 2017-02-27 22:00:00.026
3 2017-02-27 22:00:00.030
4 2017-02-27 22:00:00.033
Name: dateMillis, dtype: datetime64[ns]
Solution:
In [51]: from tzlocal import get_localzone
In [52]: mytz = get_localzone()
In [53]: mytz
Out[53]: <DstTzInfo 'Europe/Berlin' LMT+0:53:00 STD>
In [54]: pd.to_datetime(df['dateMillis'], unit='ms').dt.tz_localize('UTC').dt.tz_convert(mytz)
Out[54]:
0 2017-02-27 23:00:00+01:00
1 2017-02-27 23:00:00.013000+01:00
2 2017-02-27 23:00:00.026000+01:00
3 2017-02-27 23:00:00.030000+01:00
4 2017-02-27 23:00:00.033000+01:00
Name: dateMillis, dtype: datetime64[ns, Europe/Berlin]
Upvotes: 3