Reputation: 53
Data:
vix s_0000 s_0001 s_0002 s_0003 s_0004
Date
1990-01-02 17.24 NaN 13.389421 NaN NaN NaN
1990-01-03 18.19 NaN 13.588601 NaN NaN NaN
1990-01-04 19.22 NaN 13.610730 NaN NaN NaN
1990-01-05 20.11 NaN 13.477942 NaN NaN NaN
1990-01-08 20.26 NaN 13.477942 NaN NaN NaN
... ... ... ... ... ... ...
2020-12-23 23.31 116.631310 22.171579 15.890000 16.577030 9.00
2020-12-24 21.53 116.641243 21.912146 15.660000 16.606722 8.77
2020-12-28 21.70 117.158287 22.191536 16.059999 16.200956 8.93
2020-12-29 23.08 116.561714 21.991972 15.860000 16.745275 8.80
2020-12-30 22.77 116.720795 22.899990 16.150000 17.932884 8.60
I would like to add a new column "HVM" showing a 12-month rolling median (n=12) of vix. For instance, the HVM for January 1991 will be the median of Jan 1990 to Dec 1990. The HVM for February 1991 will be the median of Feb 1990 to Jan 1991. There will be no HVM for Year 1990. I've only managed the code below:
import pandas as pd
md = pd.read_csv('ADJ_CLOSE.zip')
md.Date = pd.to_datetime(md.Date)
md.set_index('Date', inplace=True)
md['HVM'] = md.groupby(md.Date.dt.year)['vix'].transform('median')
However this code applies the 12m median starting from 1990, which gives the same HVM for the entire year, instead of calculating from the past 12 months. Any help would be appreciated!
Upvotes: 0
Views: 152
Reputation: 120499
The problem is 12M
is considered as a non-fixed frequency. The best you can do is to compute the median over 365 days:
md = pd.read_csv('ADJ_CLOSE.zip', index_col='Date', parse_dates=['Date'])
md['HVM'] = md.rolling('365D')['vix'].median()
Upvotes: 1