M.E.

Reputation: 5527

Convert timezone of a pandas column datetime64 from UTC to America/New_York

I have tried the following to change the timezone of a datetime column in a pandas DataFrame:

print(df['column_datetime'].dtypes)
print(df['column_datetime'].tz_localize('America/New_York').dtypes)
print(df['column_datetime'].tz_convert('America/New_York').dtypes)

Which gives me:

datetime64[ns, UTC]
datetime64[ns, UTC]
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pandas/core/generic.py", line 9484, in tz_convert
    ax = _tz_convert(ax, tz)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pandas/core/generic.py", line 9472, in _tz_convert
    ax = ax.tz_convert(tz)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pandas/core/indexes/extension.py", line 78, in method
    result = attr(self._data, *args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pandas/core/arrays/datetimes.py", line 803, in tz_convert
    "Cannot convert tz-naive timestamps, use tz_localize to localize"
TypeError: Cannot convert tz-naive timestamps, use tz_localize to localize

Two questions:

  1. Why does tz_localize not return datetime64[ns, America/New_York]?
  2. Why does tz_convert say that the timestamp is tz-naive when dtypes shows UTC?

EDIT: The answer to this question actually solves the problem by calling tz_convert through the dt accessor:

import numpy as np
import pandas as pd
x = pd.Series(np.datetime64('2005-01-03 14:30:00.000000000'))
y = x.dt.tz_localize('UTC')
z = y.dt.tz_convert('America/New_York')
z
---
0   2005-01-03 09:30:00-05:00
dtype: datetime64[ns, America/New_York]

Upvotes: 1

Views: 4823

Answers (1)

Stef

Reputation: 30679

This situation is only possible if your DataFrame has a tz-naive datetime index. The following reproduces it:

import pandas as pd

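# The column values are tz-aware (UTC), but the index is tz-naive.
df = pd.DataFrame({'column_datetime': pd.to_datetime('2005-01-03 14:30', utc=True)},
                  index=[pd.to_datetime('2005-01-03 14:30')])

# dtype of the values: UTC
print(df['column_datetime'].dtypes)
# still UTC, because Series.tz_localize only localized the index
print(df['column_datetime'].tz_localize('America/New_York').dtypes)
# raises TypeError, because Series.tz_convert acts on the tz-naive index
print(df['column_datetime'].tz_convert('America/New_York').dtypes)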

Answers to your questions:

1. Why does tz_localize not return datetime64[ns, America/New_York]?

tz_localize localizes the index, not the values of the Series (for the latter you need the dt accessor, as you already found out). You can verify this by printing df['column_datetime'].tz_localize('America/New_York').index.dtype, which is datetime64[ns, America/New_York]. You printed the dtype of the values, which did not change in this operation.

This behaviour is clearly described in the documentation of tz_localize:

This operation localizes the Index. To localize the values in a timezone-naive Series, use Series.dt.tz_localize().
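
To make the index-versus-values distinction concrete, here is a minimal sketch that reuses the example DataFrame from above. It compares the timezone of the values with the timezone of the index after calling Series.tz_localize. The variable names s and localized are only illustrative, and the exact dtype strings printed may differ between pandas versions.

```python
import pandas as pd

# Same setup as above: tz-aware (UTC) values, tz-naive index.
df = pd.DataFrame({'column_datetime': pd.to_datetime('2005-01-03 14:30', utc=True)},
                  index=[pd.to_datetime('2005-01-03 14:30')])

s = df['column_datetime']

# Series.tz_localize acts on the index only.
localized = s.tz_localize('America/New_York')

print(s.index.tz)          # None: the original index is tz-naive
print(localized.index.tz)  # America/New_York: the index was localized
print(localized.dt.tz)     # UTC: the values were left untouched
```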

2. Why does tz_convert say that the timestamp is tz-naive when dtypes shows UTC?

Same reason as in 1: it tries to convert the index, which has no timezone. The documentation is not as clear on this point as it is for tz_localize.
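
Putting both answers together, a sketch of the failing call next to the working one might look like the following. It assumes the same example DataFrame as above; the name converted is only illustrative. Since the values are already tz-aware (UTC), only dt.tz_convert is needed, without a prior dt.tz_localize.

```python
import pandas as pd

# Same setup as above: tz-aware (UTC) values, tz-naive index.
df = pd.DataFrame({'column_datetime': pd.to_datetime('2005-01-03 14:30', utc=True)},
                  index=[pd.to_datetime('2005-01-03 14:30')])

# Series.tz_convert targets the index, which is tz-naive, so it fails.
try:
    df['column_datetime'].tz_convert('America/New_York')
except TypeError as exc:
    print(exc)

# The dt accessor targets the values, which are already in UTC.
converted = df['column_datetime'].dt.tz_convert('America/New_York')
print(converted.iloc[0])  # 2005-01-03 09:30:00-05:00

# Write the converted values back to the column if desired.
df['column_datetime'] = converted
```

The index stays tz-naive throughout; only the column values change timezone.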

Upvotes: 1
