Nan
Nan

Reputation: 27

How to separate the date, hour and timezone info using pandas?

I'm curious about how to use pandas to deal with this sort of info in a .csv file:

2022-08-11 11:50:01 America/Los_Angeles

My goal is to extract the date, hour and minute, and the timezone info for further analysis.

I have tried to lift out the date and time using: df['Date'] = pd.to_datetime(df['datetime']).dt.date

but got an error because of the string at the end. Other than extracting the date and time using specific indices, is there any better and quicker way? Thank you so much.

Upvotes: 0

Views: 465

Answers (1)

Tranbi
Tranbi

Reputation: 12731

pandas cannot handle a datetime column with different timezones. You can start by splitting the datetime and timezone in separate columns:

df[['datetime', 'timezone']] = df['datetime'].str.rsplit(' ', n=1, expand=True)
df['datetime'] = pd.to_datetime(df['datetime']) # this column now has the datetime64[ns] type

Now you are able to do the following:

df['date_only'] = df['datetime'].dt.date

If you want to express all local date/times in America/Los_Angeles time:

df['LA_datetime'] = df.apply(lambda x: x['datetime'].tz_localize(tz=x['timezone']).tz_convert('America/Los_Angeles'), axis = 1)

You can change America/Los_Angeles to the timezone of your liking.

Upvotes: 1

Related Questions