phaebz
phaebz

Reputation: 423

Pandas DataFrame.to_excel wrong datetime

Given this data

import pandas as pd

dt1 = pd.Timestamp('2018-01-11 23:00:00')
dt2 = pd.Timestamp('2018-01-12 01:00:00')

ts = pd.date_range(dt1, dt2, freq='15Min', tz='Europe/Vienna')
values = list(range(ts.shape[0]))

df = pd.DataFrame(values, index=ts)

where the resulting df looks like

                           0
2018-01-11 23:00:00+01:00  0
2018-01-11 23:15:00+01:00  1
2018-01-11 23:30:00+01:00  2
2018-01-11 23:45:00+01:00  3
2018-01-12 00:00:00+01:00  4
2018-01-12 00:15:00+01:00  5
2018-01-12 00:30:00+01:00  6
2018-01-12 00:45:00+01:00  7
2018-01-12 01:00:00+01:00  8

doing

df.to_excel('test.xlsx')

leads to the following in Excel 2013

2018-01-11 22:00:00 0
2018-01-11 22:15:00 1
2018-01-11 22:30:00 2
2018-01-11 22:45:00 3
2018-01-12 23:00:00 4
2018-01-12 23:15:00 5
2018-01-12 23:30:00 6
2018-01-12 23:45:00 7
2018-01-12 00:00:00 8

that is, the datetime is displayed as UTC but at midnight local time the day is wrong, i.e. 2018-01-12 23:00:00 when it should be 2018-01-11 23:00:00. Now with Excel there could be language settings and display options which interfere, but maybe my Pandas usage is wrong already?

OS language is set to be US English and no Excel specific settings. Pandas version is 0.23.0, Python 3.6

Upvotes: 2

Views: 5087

Answers (2)

Vityata
Vityata

Reputation: 43585

I get this error (with full traceback):

enter image description here

How did you manage to get something with your code? This is what I used:

import pandas as pd

dt1 = pd.Timestamp('2018-01-11 23:00:00')
dt2 = pd.Timestamp('2018-01-12 01:00:00')

ts = pd.date_range(dt1, dt2, freq='15Min', tz='Europe/Vienna')
values = list(range(ts.shape[0]))

df = pd.DataFrame(values, index=ts)
df.to_excel('test.xlsx')
print ("ready")

If I remove , tz='Europe/Vienna' then it works quite ok:

enter image description here

And in Python I get the same, using print (df):

enter image description here

Upvotes: 0

SciPy
SciPy

Reputation: 6090

Excel doesn't support timezones in datetimes. Set the tzinfo in the datetime/time object to None or use the 'remove_timezone'

ts = pd.date_range(dt1, dt2, freq='15Min')

Upvotes: 1

Related Questions