benjaminmgross
benjaminmgross

Reputation: 2102

pandas asfreq returns NaN if exact date DNE

Let's say I have financial data in a pandas.Series, called fin_series.

Here's a peek at fin_series.

In [565]: fin_series
Out[565]: 
Date
2008-05-16    1000.000000
2008-05-19    1001.651747
2008-05-20    1004.137434
...
2014-12-22    1158.085200
2014-12-23    1150.139126
2014-12-24    1148.934665
Name: Close, Length: 1665

I'm interested in looking at the quarterly endpoints of the data. However, not all financial trading days fall exactly on the 'end of the quarter.'

For example:

In [566]: fin_series.asfreq('q')
Out[566]: 
2008-06-30     976.169624
2008-09-30     819.518923
2008-12-31     760.429261
...
2009-06-30     795.768956
2009-09-30     870.467121
2009-12-31     886.329978
...
2011-09-30     963.304679
2011-12-31            NaN
2012-03-31            NaN
....
2012-09-30            NaN
2012-12-31    1095.757137
2013-03-31            NaN
2013-06-30            NaN
...
2014-03-31    1138.548881
2014-06-30    1168.248194
2014-09-30    1147.000073
Freq: Q-DEC, Name: Close, dtype: float64

Here's a little function that accomplishes what I'd like, along with the desired end result.

def bmg_qt_asfreq(series):
    ind = series[1:].index.quarter != series[:-1].index.quarter
    ind = numpy.append(ind, True)
    return tmp[ind]

which gives me:

In [15]: bmg_asfreq(tmp)
Out[15]: 
Date
2008-06-30     976.169425
2008-09-30     819.517607
2008-12-31     760.428770
... 
2011-09-30     963.252831
2011-12-30     999.742132
2012-03-30    1049.848583
...
2012-09-28    1086.689824
2012-12-31    1093.943357
2013-03-28    1117.111859
Name: Close, dtype: float64

Note that I'm preserving the dates of the "closest prior price," instead of simply using pandas.asfreq(freq = 'q', method = 'ffill'), as the preservations of dates that exist within the original Series.Index is crucial.

This seems like a silly problem that many people have had and must be addressed by all of the pandas time manipulation functionality, but I can't figure out how to do it with resample or asfreq.

Anyone who could show me the builtin pandas functionality to accomplish this would be greatly appreciated.

Regards,

Upvotes: 0

Views: 1856

Answers (2)

thistleknot
thistleknot

Reputation: 1158

df.asfreq('d').interpolate().asfreq('q')

Upvotes: 0

Luciano
Luciano

Reputation: 2418

Assuming the input is a dataframe Series , first do

import pandas as pd
fin_series.resample("q",pd.Series.last_valid_index)

to get a series with the last non-NA index for each quarter. Then

fin_series.resample("q","last")

for the last non-NA value. You can then join these together. As you suggested in your comment:

fin_series[fin_series.resample("q",pd.Series.last_valid_index)]

Upvotes: 1

Related Questions