How to create a year-month series to use as an index in a pandas dataframe?

Question

I'd like to start with the month 2019-01, add any number of consecutive months, and use the result as an index in a pandas dataframe. I've found suggestions that point to using pd.to_timedelta, but I keep bumping into problems.

Here are the details:

If you start with a date and add 5 periods like this:

import pandas as pd
import numpy as np

date = pd.to_datetime("1st of Jan, 2019")
dates = date+pd.to_timedelta(np.arange(5), 'M')

Then you get:

DatetimeIndex(['2019-01-01 00:00:00', '2019-01-31 10:29:06',
               '2019-03-02 20:58:12', '2019-04-02 07:27:18',
               '2019-05-02 17:56:24'],
              dtype='datetime64[ns]', freq=None)

You can easily remove the day and time parts, and remove duplicates to handle the double 2019-01 like this:

dates = dates.map(lambda x: x.strftime('%Y-%m'))
dates = dates.drop_duplicates()

But as you can see, 2019-02 is missing:

Index(['2019-01', '2019-03', '2019-04', '2019-05'], dtype='object')

What is a better way to do this?

Chris Adams · Accepted Answer

You could use pandas.date_range:

# 'MS' = month start; recent pandas versions deprecate the 'M' alias in date_range
pd.date_range(date, periods=5, freq='MS').strftime('%Y-%m')

[out]

Index(['2019-01', '2019-02', '2019-03', '2019-04', '2019-05'], dtype='object')
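
The timedelta approach in the question drifts because the 'M' unit there is a fixed average month length (30 days 10:29:06), not a calendar month, so the steps do not line up with month boundaries. As a rough sketch of two calendar-aware alternatives, the snippet below builds the same five labels with date_range and with pd.period_range, then uses the result as a dataframe index. The names stamps, labels, periods and df, and the "value" column, are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Month-start timestamps, then formatted as 'YYYY-MM' strings.
stamps = pd.date_range("2019-01-01", periods=5, freq="MS")
labels = stamps.strftime("%Y-%m")  # plain Index of strings

# Alternative: a PeriodIndex, where each entry represents a whole month.
periods = pd.period_range("2019-01", periods=5, freq="M")

# Hypothetical dataframe using the monthly periods as its index.
df = pd.DataFrame({"value": range(5)}, index=periods)
print(df)
```

A PeriodIndex keeps the entries as month objects instead of strings, which may be preferable if you later want to sort, shift, or resample by month. If you only need labels, the string index is enough.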
