Reputation: 592
I have the following dataframe with date information:
import numpy as np
import pandas as pd
df_Date = pd.DataFrame({'Day' : ['2', '19', '22', '15', '16'],
'Month' : ['2', '8', '1', '11', '10'],
'Year' : ['2010', '2010', '2010', '2010', '2010'],
'Hour' : ['1', '2', '3', '4', '5'],
'Minute' : ['59', '55', '33', '45', '5'],
'Second' : ['16', '17', '18', '19', '20']
})
I would like to generate, just one column, containing the complete date (Day, Month, Year, Hour, Minute and Secund).
I tried to implement the following code:
df_Date['Day'] = df_Date['Day'].astype(int)
df_Date['Month'] = df_Date['Month'].astype(int)
df_Date['Year'] = df_Date['Year'].astype(int)
df_Date['New_Column_Date'] = pd.to_datetime(df_Date.Year*10000 + df_Date.Month*100 +
df_Date.Day, format='%Y%m%d')
This code is working. However, it is not complete, the hours minutes and second are missing.
My exit is like this:
Day Month Year Hour Minute Second New_Column_Date
2 2 2010 1 59 16 2010-02-02
19 8 2010 2 55 17 2010-08-19
22 1 2010 3 33 18 2010-01-22
15 11 2010 4 45 19 2010-11-15
16 10 2010 5 5 20 2010-10-16
I would like the output to be:
Day Month Year Hour Minute Second New_Column_Date
2 2 2010 1 59 16 2010-02-02 01:59:16
19 8 2010 2 55 17 2010-08-19 02:55:17
22 1 2010 3 33 18 2010-01-22 03:33:18
15 11 2010 4 45 19 2010-11-15 04:45:19
16 10 2010 5 5 20 2010-10-16 05:05:20
Upvotes: 1
Views: 51
Reputation: 403208
As long as your headers are named as shown in your post, you can delegate all the heavy lifting to pd.to_datetime
:
pd.to_datetime(df[['Year', 'Month', 'Day', 'Hour', 'Minute', 'Second']])
0 2010-02-02 01:59:16
1 2010-08-19 02:55:17
2 2010-01-22 03:33:18
3 2010-11-15 04:45:19
4 2010-10-16 05:05:20
dtype: datetime64[ns]
The precondition is that you name your columns "Year", "Month", "Day", ... etc as shown above. The order of the columns isn't even that important. The names are extremely important.
Upvotes: 1