Reputation: 49
I have a pandas dataframe with UNIX timestamps (these are integers and not time objects). I'd like to convert the UNIX timestamps into local time (according to China timezone). So, based on this, I tried to do the following:
import pandas as pd
data = {'timestamp': [1540651297, 1540651300, 1540651303, 1540651306, 1540651309, 1540651312]}
df = pd.DataFrame (data, columns = ['timestamp'])
df
df['timestamp1'] = pd.to_datetime(df.timestamp, unit='s')
df['timestamp2']=df['timestamp'].apply(lambda d: datetime.datetime.fromtimestamp(int(d)).strftime('%Y-%m-%d %H:%M:%S'))
df['timestamp3'] = df['timestamp1'].dt.tz_localize('Asia/Shanghai').dt.tz_convert('UTC')
timestamp | timestamp1 | timestamp2 | timestamp3 |
---|---|---|---|
1540651297 | 2018-10-27 14:41:37 | 2018-10-27 22:41:37 | 2018-10-27 06:41:37+00:00 |
1540651300 | 2018-10-27 14:41:40 | 2018-10-27 22:41:40 | 2018-10-27 06:41:40+00:00 |
1540651303 | 2018-10-27 14:41:43 | 2018-10-27 22:41:43 | 2018-10-27 06:41:43+00:00 |
1540651306 | 2018-10-27 14:41:46 | 2018-10-27 22:41:46 | 2018-10-27 06:41:46+00:00 |
1540651309 | 2018-10-27 14:41:49 | 2018-10-27 22:41:49 | 2018-10-27 06:41:49+00:00 |
Upvotes: 1
Views: 1242
Reputation: 423
df['timestamp1'] = pd.to_datetime(df.timestamp, unit='s')
This statement here creates a column with datetime value with current time. The datetime values are time-zone naive and in UTC.
df['timestamp2']=df['timestamp'].apply(lambda d: datetime.datetime.fromtimestamp(int(d)).strftime('%Y-%m-%d %H:%M:%S'))
datetime.datetime.fromtimestamp
takes in a timestamp and returns a local datetime. For ASIA/Shanghai
, the offset from UTC would be +8. The datetime values are still time-zone naive.
df['timestamp1'].dt.tz_localize('Asia/Shanghai')
This returns a Series with time-zone aware datetime using time-zone naive one(timestamp1
).
2018-10-27 14:41:37
becomes 2018-10-27 14:41:37+08:00
.
df['timestamp1'].dt.tz_localize('Asia/Shanghai').dt.tz_convert('UTC')
The dt.tz_convert('UTC')
converts tz-aware datetime from one time zone to another.
2018-10-27 14:41:37+08:00
is converted to UTC datetime with time-zone as 2018-10-27 06:41:37+00:00
.
What you should have done instead is
df['timestamp3'] = df['timestamp1'].dt.tz_localize('UTC').dt.tz_convert('Asia/Shanghai')
Which converts the time zone naive UTC datetime to time zone aware UTC datetime and then to the Asia/Shanghai
time-zone
The result would be:
timestamp | timestamp1 | timestamp2 | timestamp3 |
---|---|---|---|
1540651297 | 2018-10-27 14:41:37 | 2018-10-27 22:41:37 | 2018-10-27 22:41:37+08:00 |
1540651300 | 2018-10-27 14:41:40 | 2018-10-27 22:41:40 | 2018-10-27 22:41:40+08:00 |
1540651303 | 2018-10-27 14:41:43 | 2018-10-27 22:41:43 | 2018-10-27 22:41:43+08:00 |
1540651306 | 2018-10-27 14:41:46 | 2018-10-27 22:41:46 | 2018-10-27 22:41:46+08:00 |
1540651309 | 2018-10-27 14:41:49 | 2018-10-27 22:41:49 | 2018-10-27 22:41:49+08:00 |
Upvotes: 2