Reputation: 21
I have a dataset with two columns in a data frame: Date in YYYY-MM-DD format, and Hour in HHMM format, running from 0100 (for 1am) to 2300 (for 11pm).
Date Hour
2017-01-01 0200
2017-01-01 0400
etc.
To get it ready for a time series model, I want to convert these into datetime objects and concatenate the two columns. Example of the desired output: 2017-01-01 01:00:00, etc.
I have tried df['Date'] = pd.to_datetime(df['Date'])
which converts Date into a datetime object, but I'm struggling with the Hour column. Please help.
Upvotes: 2
Views: 2876
Reputation: 164673
This is one way. The trick is to note that pd.to_datetime
is actually quite flexible: it accepts strings of the format "YYYY-MM-DD HHMM".
I assume here that your Hour column holds strings (otherwise the leading zeros could not be stored).
import pandas as pd
df = pd.DataFrame({'Date': ['2017-01-01', '2017-01-01'],
                   'Hour': ['0200', '0400']})
# as per @COLDSPEED's suggestion
df['DateTime'] = pd.to_datetime(df['Date'] + ' ' + df['Hour'])
print(df)
#          Date  Hour            DateTime
# 0  2017-01-01  0200 2017-01-01 02:00:00
# 1  2017-01-01  0400 2017-01-01 04:00:00
print(df.dtypes)
# Date                object
# Hour                object
# DateTime    datetime64[ns]
# dtype: object
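If you would rather not rely on pandas working out the layout by itself, one option is to pass an explicit format. A minimal sketch, assuming the same two string columns as in the example; the format string '%Y-%m-%d %H%M' is my own addition rather than something from the question:

```python
import pandas as pd

df = pd.DataFrame({'Date': ['2017-01-01', '2017-01-01'],
                   'Hour': ['0200', '0400']})

# Assumed format: %Y-%m-%d matches the Date column,
# %H%M reads a 4-digit string such as '0200' as 02:00.
df['DateTime'] = pd.to_datetime(df['Date'] + ' ' + df['Hour'],
                                format='%Y-%m-%d %H%M')
print(df)
```

With an explicit format, a value that does not fit the pattern should raise an error instead of being guessed at, which is usually what you want before feeding a time series model.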
My previous version, using pd.DataFrame.apply, also works but is inefficient because it processes the frame row by row:
df['DateTime'] = df.apply(lambda x: x['Date'] + ' ' + x['Hour'], axis=1)\
                   .apply(pd.to_datetime)
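If Hour was read in as integers rather than strings (so 0200 arrives as 200), the values need zero-padding before they can be concatenated. A rough sketch, assuming a hypothetical integer Hour column; hour_str is just an illustrative name:

```python
import pandas as pd

# Hypothetical input: Hour stored as integers, so leading zeros are lost.
df = pd.DataFrame({'Date': ['2017-01-01', '2017-01-01'],
                   'Hour': [200, 400]})

# Convert to string and left-pad with zeros to 4 characters: 200 -> '0200'.
hour_str = df['Hour'].astype(str).str.zfill(4)

df['DateTime'] = pd.to_datetime(df['Date'] + ' ' + hour_str,
                                format='%Y-%m-%d %H%M')
print(df)
```

Reading the file with the Hour column forced to a string dtype would avoid the padding step altogether, if you control how the data is loaded.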
Upvotes: 3