Jane Borges
Jane Borges

Reputation: 592

Join the columns for dates

I have the following dataframe with date information:

      import numpy as np
      import pandas as pd

      df_Date = pd.DataFrame({'Day' : ['2', '19', '22', '15', '16'], 
                              'Month' : ['2', '8', '1', '11', '10'], 
                              'Year' : ['2010', '2010', '2010', '2010', '2010'], 
                              'Hour' : ['1', '2', '3', '4', '5'],
                              'Minute' : ['59', '55', '33', '45', '5'],
                              'Second' : ['16', '17', '18', '19', '20']                       
                             })

I would like to generate, just one column, containing the complete date (Day, Month, Year, Hour, Minute and Secund).

I tried to implement the following code:

     df_Date['Day'] = df_Date['Day'].astype(int)
     df_Date['Month'] = df_Date['Month'].astype(int)
     df_Date['Year'] = df_Date['Year'].astype(int)

     df_Date['New_Column_Date'] = pd.to_datetime(df_Date.Year*10000 + df_Date.Month*100 + 
                                                 df_Date.Day, format='%Y%m%d')

This code is working. However, it is not complete, the hours minutes and second are missing.

My exit is like this:

        Day   Month     Year    Hour    Minute  Second  New_Column_Date
         2      2       2010    1         59    16          2010-02-02
         19     8       2010    2         55    17          2010-08-19
         22     1       2010    3         33    18          2010-01-22
         15     11      2010    4         45    19          2010-11-15
         16     10      2010    5          5    20          2010-10-16

I would like the output to be:

        Day   Month     Year    Hour    Minute  Second  New_Column_Date
         2      2       2010    1         59    16          2010-02-02 01:59:16
         19     8       2010    2         55    17          2010-08-19 02:55:17
         22     1       2010    3         33    18          2010-01-22 03:33:18
         15     11      2010    4         45    19          2010-11-15 04:45:19
         16     10      2010    5          5    20          2010-10-16 05:05:20

Upvotes: 1

Views: 51

Answers (1)

cs95
cs95

Reputation: 403208

As long as your headers are named as shown in your post, you can delegate all the heavy lifting to pd.to_datetime:

pd.to_datetime(df[['Year', 'Month', 'Day', 'Hour', 'Minute', 'Second']])

0   2010-02-02 01:59:16
1   2010-08-19 02:55:17
2   2010-01-22 03:33:18
3   2010-11-15 04:45:19
4   2010-10-16 05:05:20
dtype: datetime64[ns]

The precondition is that you name your columns "Year", "Month", "Day", ... etc as shown above. The order of the columns isn't even that important. The names are extremely important.

Upvotes: 1

Related Questions