WEITIAN SHI

Reputation: 11

Converting index to datetime index

My data frame is like this:

YearMonth Number of Visitors 
Jan-91    177400
Feb-91    190600
Mar-91    189200
Apr-91    168000
May-91    161400

I want to convert the index to something like 1991-01. I tried index.to_datetime but received an error:

ValueError: day is out of range for month

My code is as below:

dataC = pd.read_csv('Visitors.csv', index_col='YearMonth', parse_dates=True, dayfirst=True)
dataC.index = dataC.index.to_datetime(dayfirst=True)

How can I achieve this?

Upvotes: 1

Views: 328

Answers (1)

jezrael

Reputation: 862406

The first part of your code (the read_csv call) works for me:

import pandas as pd
from io import StringIO

pd.options.display.max_columns = 20

temp="""YearMonth,Number of Visitors
Jan-91,177400
Feb-91,190600
Mar-91,189200
Apr-91,168000
May-91,161400"""
#after testing, replace 'StringIO(temp)' with 'Visitors.csv'
dataC = pd.read_csv(StringIO(temp), index_col='YearMonth', parse_dates=True)
print (dataC)
            Number of Visitors
YearMonth                     
1991-01-01              177400
1991-02-01              190600
1991-03-01              189200
1991-04-01              168000
1991-05-01              161400

print (dataC.index)
DatetimeIndex(['1991-01-01', '1991-02-01', '1991-03-01', '1991-04-01',
               '1991-05-01'],
              dtype='datetime64[ns]', name='YearMonth', freq=None)

If you want a different format, one possible solution is to convert the index to monthly periods with DataFrame.to_period:

dataC = dataC.to_period('M')
print (dataC)
           Number of Visitors
YearMonth                    
1991-01                177400
1991-02                190600
1991-03                189200
1991-04                168000
1991-05                161400

print (dataC.index)
PeriodIndex(['1991-01', '1991-02', '1991-03', '1991-04', '1991-05'], 
             dtype='period[M]', name='YearMonth', freq='M')

In your code, if you want to convert the index to a DatetimeIndex, the correct approach is pd.to_datetime with the format parameter:

dataC.index = pd.to_datetime(dataC.index, format='%b-%y')
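
To show how the explicit format fits together with the 1991-01 display asked for in the question, here is a minimal sketch. It assumes the index is read as plain strings (no parse_dates) and that month abbreviations are English, since %b depends on the locale. The names csv_text, df, monthly and labels are only for illustration.

```python
import pandas as pd
from io import StringIO

# Small sample in the same shape as the question's data (illustrative only)
csv_text = """YearMonth,Number of Visitors
Jan-91,177400
Feb-91,190600
Mar-91,189200"""

# Read without parse_dates so the index stays as strings like 'Jan-91'
df = pd.read_csv(StringIO(csv_text), index_col='YearMonth')

# Explicit format: %b = abbreviated month name, %y = two-digit year
df.index = pd.to_datetime(df.index, format='%b-%y')

# Option 1: a PeriodIndex, which displays as 1991-01
monthly = df.to_period('M')

# Option 2: plain string labels in year-month form
labels = df.index.strftime('%Y-%m')

print(monthly)
print(list(labels))
```

The PeriodIndex keeps the values date-aware, so resampling and period arithmetic still work, while strftime returns ordinary strings that are only useful for display.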

Upvotes: 2
