Reputation: 2102
Let's say I have financial data in a pandas.Series
, called fin_series.
Here's a peek at fin_series
.
In [565]: fin_series
Out[565]:
Date
2008-05-16 1000.000000
2008-05-19 1001.651747
2008-05-20 1004.137434
...
2014-12-22 1158.085200
2014-12-23 1150.139126
2014-12-24 1148.934665
Name: Close, Length: 1665
I'm interested in looking at the quarterly endpoints of the data. However, not all financial trading days fall exactly on the 'end of the quarter.'
For example:
In [566]: fin_series.asfreq('q')
Out[566]:
2008-06-30 976.169624
2008-09-30 819.518923
2008-12-31 760.429261
...
2009-06-30 795.768956
2009-09-30 870.467121
2009-12-31 886.329978
...
2011-09-30 963.304679
2011-12-31 NaN
2012-03-31 NaN
....
2012-09-30 NaN
2012-12-31 1095.757137
2013-03-31 NaN
2013-06-30 NaN
...
2014-03-31 1138.548881
2014-06-30 1168.248194
2014-09-30 1147.000073
Freq: Q-DEC, Name: Close, dtype: float64
Here's a little function that accomplishes what I'd like, along with the desired end result.
def bmg_qt_asfreq(series):
ind = series[1:].index.quarter != series[:-1].index.quarter
ind = numpy.append(ind, True)
return tmp[ind]
which gives me:
In [15]: bmg_asfreq(tmp)
Out[15]:
Date
2008-06-30 976.169425
2008-09-30 819.517607
2008-12-31 760.428770
...
2011-09-30 963.252831
2011-12-30 999.742132
2012-03-30 1049.848583
...
2012-09-28 1086.689824
2012-12-31 1093.943357
2013-03-28 1117.111859
Name: Close, dtype: float64
Note that I'm preserving the dates of the "closest prior price," instead of simply using pandas.asfreq(freq = 'q', method = 'ffill')
, as the preservations of dates that exist within the original Series.Index
is crucial.
This seems like a silly problem that many people have had and must be addressed by all of the pandas
time manipulation functionality, but I can't figure out how to do it with resample
or asfreq.
Anyone who could show me the builtin pandas
functionality to accomplish this would be greatly appreciated.
Regards,
Upvotes: 0
Views: 1856
Reputation: 2418
Assuming the input is a dataframe Series
, first do
import pandas as pd
fin_series.resample("q",pd.Series.last_valid_index)
to get a series with the last non-NA index for each quarter. Then
fin_series.resample("q","last")
for the last non-NA value. You can then join these together. As you suggested in your comment:
fin_series[fin_series.resample("q",pd.Series.last_valid_index)]
Upvotes: 1