Ravi
Ravi

Reputation: 3303

Date and Month Mixup in Pandas

In Pandas, I am trying to format a date column from String to proper date so that I can export it to ElasticSearch. However, date and month are getting mixed up. I have given an example below.

df = pd.DataFrame({'Date':['12/03/2020 0:00', '11/02/2019 0:00', '10/01/2020 0:00'], 
                'Event':['Music', 'Poetry', 'Theatre'], 
                'Cost':[10000, 5000, 15000]}) 

Date is entered in dd/mm/YYYY format.

df['Date1'] = df['Date'].astype('datetime64[ns]')
df['Year'] = pd.DatetimeIndex(df['Date']).year  
df['Month'] = pd.DatetimeIndex(df['Date1']).month
df['Day'] = pd.DatetimeIndex(df['Date1']).day
df

This results in the following data frame where the date and month are interchanged.Year is extracted correct.

          Date      Event   Cost    Date1       Year    Month   Day
0   12/03/2020 0:00 Music   10000   2020-12-03  2020    12  3
1   11/02/2019 0:00 Poetry  5000    2019-11-02  2019    11  2
2   10/01/2020 0:00 Theatre 15000   2020-10-01  2020    10  1

Can someone provide inputs on how to format the date column in an appropriate way? Thanks

Upvotes: 0

Views: 90

Answers (1)

AKX
AKX

Reputation: 168863

You'll want to use pd.to_datetime() to convert the data to real datetimes first:

df['Date'] = pd.to_datetime(df['Date'])

Happily, the default parameters seem to work for parsing your example data:

>>> df
        Date    Event   Cost
0 2020-12-03    Music  10000
1 2019-11-02   Poetry   5000
2 2020-10-01  Theatre  15000

If you need the separate d/m/y columns, you can access the series' dt property instead of converting via a DatetimeIndex:

>>> df['Year'] = df['Date'].dt.year
>>> # ... etc ...
>>> df
        Date    Event   Cost  Year
0 2020-12-03    Music  10000  2020
1 2019-11-02   Poetry   5000  2019
2 2020-10-01  Theatre  15000  2020

Upvotes: 1

Related Questions