Gaurav Bansal
Gaurav Bansal

Reputation: 5660

Extract date from timestamps of multiple time zones in Pandas

I have a Pandas DataFrame in which I've converted hour to local_hour based on the time_zone column. I now want to extract the date from local_hour as local_date but I get an error saying Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True. How can I do this?

# Create dataframe
import pandas as pd
df = pd.DataFrame({
   'hour': ['2019-01-01 05:00:00', '2019-01-01 07:00:00', '2019-01-01 08:00:00'],
   'time_zone': ['US/Eastern', 'US/Central', 'US/Mountain']
})

# Convert hour to datetime and localize to UTC
df['hour'] = pd.to_datetime(df['hour']).dt.tz_localize('UTC')

# Get local_hour
df['local_hour'] = df.apply(lambda row: row['hour'].tz_convert(row['time_zone']), axis=1)

# Try to get local_date from local_hour
df['local_date'] = pd.to_datetime(df['local_hour'].dt.date)
ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True

Upvotes: 2

Views: 1229

Answers (2)

JLM
JLM

Reputation: 758

The following solution works for me:

df['local_date'] = pd.to_datetime(df['hour'], infer_datetime_format=True, utc=True )
df['local_date'] = df['local_date'].dt.tz_localize('Europe/Amsterdam')

Obviously, the methods can be concatenated, but separated for readibility.

  • utc = True : Return UTC DatetimeIndex which is the needed one to allow the tz_localize() method.
  • infer_datetime_format = True : another useful param which attempts to infer the format of the datetime strings.

Reference in pandas 0.19

Upvotes: 0

Code Different
Code Different

Reputation: 93161

It seems like the problem only happens when your local_hour column contains different timezones. Had everything been in the same timezone, this would have worked:

# Work: the whole column in a single timezone
df['local_hour'] = df['hour'].dt.tz_convert('America/New_York')
df['local_hour'].dt.date

# Not work: column contains a mix of timezones
df['local_hour'] = df.apply(lambda row: row['hour'].tz_convert(row['time_zone']), axis=1)
df['local_hour'].dt.date

ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True

I'd suggest you file an issue with the pandas team. Mean while, you can use apply which is essentially a loop:

tmp = df['local_hour'].apply(lambda t: pd.Series({
    'date': t.date(),
    'hour': t.hour
}))
df = pd.concat([df, tmp], axis=1)

Upvotes: 1

Related Questions