M_Fatih89
M_Fatih89

Reputation: 49

Why pandas convert UNIX timestamps to multiple different date-time values?

I have a pandas dataframe with UNIX timestamps (these are integers and not time objects). I'd like to convert the UNIX timestamps into local time (according to China timezone). So, based on this, I tried to do the following:

import pandas as pd
data = {'timestamp':  [1540651297,  1540651300, 1540651303,  1540651306,  1540651309,  1540651312]}
df = pd.DataFrame (data, columns = ['timestamp'])
df
df['timestamp1'] = pd.to_datetime(df.timestamp, unit='s')
df['timestamp2']=df['timestamp'].apply(lambda d: datetime.datetime.fromtimestamp(int(d)).strftime('%Y-%m-%d %H:%M:%S'))
df['timestamp3'] = df['timestamp1'].dt.tz_localize('Asia/Shanghai').dt.tz_convert('UTC')
timestamp timestamp1 timestamp2 timestamp3
1540651297 2018-10-27 14:41:37 2018-10-27 22:41:37 2018-10-27 06:41:37+00:00
1540651300 2018-10-27 14:41:40 2018-10-27 22:41:40 2018-10-27 06:41:40+00:00
1540651303 2018-10-27 14:41:43 2018-10-27 22:41:43 2018-10-27 06:41:43+00:00
1540651306 2018-10-27 14:41:46 2018-10-27 22:41:46 2018-10-27 06:41:46+00:00
1540651309 2018-10-27 14:41:49 2018-10-27 22:41:49 2018-10-27 06:41:49+00:00

Upvotes: 1

Views: 1242

Answers (1)

Zoro
Zoro

Reputation: 423

df['timestamp1'] = pd.to_datetime(df.timestamp, unit='s')

This statement here creates a column with datetime value with current time. The datetime values are time-zone naive and in UTC.

df['timestamp2']=df['timestamp'].apply(lambda d: datetime.datetime.fromtimestamp(int(d)).strftime('%Y-%m-%d %H:%M:%S'))

datetime.datetime.fromtimestamp takes in a timestamp and returns a local datetime. For ASIA/Shanghai, the offset from UTC would be +8. The datetime values are still time-zone naive.

df['timestamp1'].dt.tz_localize('Asia/Shanghai')

This returns a Series with time-zone aware datetime using time-zone naive one(timestamp1).

2018-10-27 14:41:37 becomes 2018-10-27 14:41:37+08:00.

df['timestamp1'].dt.tz_localize('Asia/Shanghai').dt.tz_convert('UTC')

The dt.tz_convert('UTC') converts tz-aware datetime from one time zone to another. 2018-10-27 14:41:37+08:00 is converted to UTC datetime with time-zone as 2018-10-27 06:41:37+00:00. What you should have done instead is

df['timestamp3'] = df['timestamp1'].dt.tz_localize('UTC').dt.tz_convert('Asia/Shanghai')

Which converts the time zone naive UTC datetime to time zone aware UTC datetime and then to the Asia/Shanghai time-zone The result would be:

timestamp timestamp1 timestamp2 timestamp3
1540651297 2018-10-27 14:41:37 2018-10-27 22:41:37 2018-10-27 22:41:37+08:00
1540651300 2018-10-27 14:41:40 2018-10-27 22:41:40 2018-10-27 22:41:40+08:00
1540651303 2018-10-27 14:41:43 2018-10-27 22:41:43 2018-10-27 22:41:43+08:00
1540651306 2018-10-27 14:41:46 2018-10-27 22:41:46 2018-10-27 22:41:46+08:00
1540651309 2018-10-27 14:41:49 2018-10-27 22:41:49 2018-10-27 22:41:49+08:00

Upvotes: 2

Related Questions