Abiel
Abiel

Reputation: 5475

Forcing dates to conform to a given frequency in pandas

Suppose we have a monthly time series, possibly with missing months, and upon loading the data into a pandas Series object with DatetimeIndex we wish to make sure each date observation is labeled as an end-of-month date. However, the raw input dates may fall anywhere in the month, so we need to force them to end-of-month observations.

My first thought was to do something like this:

import pandas as pd
pd.DatetimeIndex([datetime(2012,1,20), datetime(2012,7,31)], freq='M')

However, this just leaves the dates as is [2012-01-20,2012-07-31] and does not force them to end-of-month values [2012-01-31,2012-07-31].

My second attempt was:

ix = pd.DatetimeIndex([datetime(2012,1,20), datetime(2012,7,31)], freq='M')
s = pd.Series(np.random.randn(len(ix)), index=ix)
s.asfreq('M')

But this gives:

2012-01-31        NaN
2012-02-29        NaN
2012-03-31        NaN
2012-04-30        NaN
2012-05-31        NaN
2012-06-30        NaN
2012-07-31    0.79173
Freq: M

as under the hood the asfreq function is calling date_range for a DatetimeIndex.

This problem is easily solved if I'm using PeriodIndex instead of DatetimeIndex; however, I need to support some frequencies that are not currently supported by PeriodIndex and as far as I know there is no way to extend pandas with my own Period frequencies.

Upvotes: 1

Views: 1945

Answers (1)

root
root

Reputation: 80456

It's a workaround, but it works without using periodindex:

from pandas.tseries.offsets import *

In [164]: s
Out[164]: 
2012-01-20   -1.266376
2012-07-31   -0.865573

In [165]: s.index=s.index+MonthEnd(n=0)

In [166]: s
Out[166]: 
2012-01-31   -1.266376
2012-07-31   -0.865573

Upvotes: 1

Related Questions