user20977195

Reputation: 31

How to convert year, month, day, hour/minute columns into a single datetime column?

I have the following data format with different columns for year, month, day, and hour_minute (the first two digits are hour and the last two digits are minutes). How do I create a new column in datetime format by combining all of these existing columns?

YEAR MONTH DAY HOUR_MINUTE
2015 1 15 0010
2015 1 2 0020
2015 1 15 0045
2015 1 15 2110
2015 10 21 2359

I tried the following, but had no luck. Thank you for your advice.

df["new_column"]= pd.to_datetime(df[["YEAR", "MONTH", "DAY","HOUR_MINUTE"]])

Upvotes: 2

Views: 943

Answers (3)

Laurent B.

Reputation: 2263

Suggested script

import pandas as pd

df1 = pd.DataFrame({'YEAR': ['2015', '2015', '2015', '2015', '2015'],
                   'MONTH': ['1', '1', '1', '1', '10'],
                   'DAY': ['15', '2', '15', '15', '21'],
                   'HOUR_MINUTE': ['0010', '0020', '0045', '2110', '2359']
                   })

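# Build one "YEAR-MONTH-DAY-HOUR_MINUTE" string per row from all columns
df1['FMT'] = df1.agg('-'.join(['{0[%s]}'%c for c in df1.columns]).format, axis=1)
# Parse the combined string; %H%M reads the four-digit HOUR_MINUTE part
df1['FMT'] = pd.to_datetime(df1['FMT'], format='%Y-%m-%d-%H%M')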

print(df1)

Output

   YEAR MONTH DAY HOUR_MINUTE                 FMT
0  2015     1  15        0010 2015-01-15 00:10:00
1  2015     1   2        0020 2015-01-02 00:20:00
2  2015     1  15        0045 2015-01-15 00:45:00
3  2015     1  15        2110 2015-01-15 21:10:00
4  2015    10  21        2359 2015-10-21 23:59:00
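
The same idea can probably be written without a row-wise `agg`, by concatenating whole columns. This is a minimal sketch rather than part of the original answer: it assumes the four column names from the question, and the `text` variable and the `zfill(4)` padding (in case HOUR_MINUTE was read as integers) are my own additions.

```python
import pandas as pd

df = pd.DataFrame({
    "YEAR": ["2015", "2015", "2015", "2015", "2015"],
    "MONTH": ["1", "1", "1", "1", "10"],
    "DAY": ["15", "2", "15", "15", "21"],
    "HOUR_MINUTE": ["0010", "0020", "0045", "2110", "2359"],
})

# Concatenate whole columns at once; astype(str) also covers integer columns.
# zfill(4) restores leading zeros in HOUR_MINUTE if they were lost on import.
text = (
    df["YEAR"].astype(str) + "-"
    + df["MONTH"].astype(str) + "-"
    + df["DAY"].astype(str) + "-"
    + df["HOUR_MINUTE"].astype(str).str.zfill(4)
)

# The separators keep single-digit months and days unambiguous.
df["FMT"] = pd.to_datetime(text, format="%Y-%m-%d-%H%M")
print(df["FMT"])
```

Column-wise concatenation usually scales better than per-row formatting on large frames, though for a handful of rows either approach is fine.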

Upvotes: 0

Surjit Samra

Reputation: 4662

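If the DataFrame contains only the YEAR, MONTH, DAY, and HOUR_MINUTE columns, you can apply this to the entire df. Zero-pad each part before joining; otherwise a row such as 2015, 1, 15, 0010 becomes "20151150010" and is parsed as 2015-11-05.

df.apply(lambda row: pd.to_datetime('{}{:0>2}{:0>2}{:0>4}'.format(*row.values.astype(str)), format="%Y%m%d%H%M"), axis=1)
Out[198]:
0   2015-01-15 00:10:00
1   2015-01-02 00:20:00
2   2015-01-15 00:45:00
3   2015-01-15 21:10:00
4   2015-10-21 23:59:00
dtype: datetime64[ns]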

If there are other columns as well, select the required columns first and then apply. The parts are zero-padded so that the concatenated string is unambiguous.

df[['YEAR', 'MONTH', 'DAY', 'HOUR_MINUTE']].apply(lambda row: pd.to_datetime('{}{:0>2}{:0>2}{:0>4}'.format(*row.values.astype(str)), format="%Y%m%d%H%M"), axis=1)
Out[201]:
0   2015-01-15 00:10:00
1   2015-01-02 00:20:00
2   2015-01-15 00:45:00
3   2015-01-15 21:10:00
4   2015-10-21 23:59:00
dtype: datetime64[ns]

To assign the result to new_column in df (again zero-padding each part before joining):

df['new_column'] = df[['YEAR', 'MONTH', 'DAY', 'HOUR_MINUTE']].apply(lambda row: pd.to_datetime('{}{:0>2}{:0>2}{:0>4}'.format(*row.values.astype(str)), format="%Y%m%d%H%M"), axis=1)
df
Out[205]:
   YEAR  MONTH  DAY HOUR_MINUTE          new_column
0  2015      1   15        0010 2015-01-15 00:10:00
1  2015      1    2        0020 2015-01-02 00:20:00
2  2015      1   15          45 2015-01-15 00:45:00
3  2015      1   15        2110 2015-01-15 21:10:00
4  2015     10   21        2359 2015-10-21 23:59:00
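
One thing to watch with a separator-free format such as "%Y%m%d%H%M": if month and day are not zero-padded, the joined string can be split in more than one way. The sketch below uses only the standard library to illustrate this for a single row; the variable names are hypothetical and it is not part of the original answer.

```python
from datetime import datetime

fmt = "%Y%m%d%H%M"

# Row from the question: YEAR=2015, MONTH=1, DAY=15, HOUR_MINUTE=0010
unpadded = "2015" + "1" + "15" + "0010"                          # "20151150010"
padded = "2015" + "1".zfill(2) + "15".zfill(2) + "0010".zfill(4)  # "201501150010"

parsed_padded = datetime.strptime(padded, fmt)
parsed_unpadded = datetime.strptime(unpadded, fmt)

print(parsed_padded)    # 2015-01-15 00:10:00
print(parsed_unpadded)  # a different date: "11" can be consumed as the month
```

Padding each part to a fixed width, or joining with separators, avoids the ambiguity.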

Upvotes: 0

Jason Baker

Reputation: 3706

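You need to split the HOUR_MINUTE column into HOUR and MINUTE, because pd.to_datetime can only assemble a datetime from columns named after datetime components (year, month, day, hour, minute, and so on).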

df["HOUR"] = df["HOUR_MINUTE"].str[0:2]
df["MINUTE"] = df.pop("HOUR_MINUTE").str[2:4]
df["new_column"] = pd.to_datetime(df[["YEAR", "MONTH", "DAY", "HOUR", "MINUTE"]], format="%Y-%m-%d %H:%M")
print(df)

Output:

   YEAR  MONTH  DAY HOUR MINUTE          new_column
0  2015      1   15   00     10 2015-01-15 00:10:00
1  2015      1    2   00     20 2015-01-02 00:20:00
2  2015      1   15   00     45 2015-01-15 00:45:00
3  2015      1   15   21     10 2015-01-15 21:10:00
4  2015     10   21   23     59 2015-10-21 23:59:00
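
If you would rather keep the original HOUR_MINUTE column, a variant along these lines might work. It is a sketch under assumptions: the sample frame mirrors the question, HOUR_MINUTE is converted to integers, and the `hm` and `parts` names are hypothetical helpers that do not appear in the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    "YEAR": [2015, 2015, 2015, 2015, 2015],
    "MONTH": [1, 1, 1, 1, 10],
    "DAY": [15, 2, 15, 15, 21],
    "HOUR_MINUTE": ["0010", "0020", "0045", "2110", "2359"],
})

# Interpret HOUR_MINUTE as a number: 2110 -> hour 21, minute 10.
hm = df["HOUR_MINUTE"].astype(int)

# Temporary frame with the component names pd.to_datetime can assemble.
parts = pd.DataFrame({
    "year": df["YEAR"],
    "month": df["MONTH"],
    "day": df["DAY"],
    "hour": hm // 100,
    "minute": hm % 100,
})

# df itself keeps all of its original columns.
df["new_column"] = pd.to_datetime(parts)
print(df)
```

Integer division also handles values whose leading zeros were dropped on import (for example 45 instead of 0045), which string slicing would not.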

Upvotes: 1
