Eric D. Brown D.Sc.
Eric D. Brown D.Sc.

Reputation: 1956

Find rolling 52 week high on daily stock market data in pandas

This seems like a simple question (and answer) but I'm having trouble with it.

The issue:

I have a pandas dataframe full of OHLC data. I want to find the rolling 52 week high throughout the dataframe.

My dataset is from yahoo. You can pull the same data down with the folllowing code to get daily data:

import pandas.io.data as web
df = web.DataReader('SPX', 'yahoo', start, end)

A tail of the data gives the output below:

                 Open        High         Low       Close     Volume  
Date                                                                    
2016-07-15  216.779999  217.009995  215.309998  215.830002  107155400   
2016-07-18  215.970001  216.600006  215.669998  216.410004   58725900   
2016-07-19  215.919998  216.229996  215.630005  216.190002   54345700   
2016-07-20  216.190002  217.369995  216.190002  217.089996   58159500   
2016-07-21  216.960007  217.220001  215.750000  216.270004   66070000 

To get the 52 week high (rolling), I can run the following:

df["52weekhigh"] = pd.rolling_max(df.High, window=200, min_periods=1)

I get the following (some col:

                 High  52weekhigh
Date                              
2016-07-15  217.009995  217.009995
2016-07-18  216.600006  217.009995
2016-07-19  216.229996  217.009995
2016-07-20  217.369995  217.369995
2016-07-21  217.220001  217.369995

This gives me an values for the 52 week highs as new highs come in, but I'm not a fan of using 200 here. Should it be 200 or 201 or 220 (there are "approximately" 200 trading days in the year)?

I could resample the data to weekly to get the values, but then i can't easily get back to my original daily data (or can I?).

So...here's the question:

Is there a way to run rolling_max on pandas dataframes and set the window to '52 weeks' or something similar? If not, can anyone think of a better approach to this than the above?

Upvotes: 2

Views: 8746

Answers (1)

unutbu
unutbu

Reputation: 879471

If your data has business-day frequency then there should be roughly 5 rows per week. So 52 weeks would roughly correspond to window=52*5.

Of course, there might be a few other days missing due to holidays. To be more accurate, you could use asfreq('D') to change the frequency from business days to actual days. Then you could use a rolling window of size 52*7:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
np.random.seed(2016)

N = 1000
index = pd.date_range('2000-1-1', periods=N, freq='B')
data = (np.random.random((N, 1))-0.5).cumsum(axis=0)
df = pd.DataFrame(data, index=index, columns=['price'])
# result = pd.rolling_max(df.asfreq('D'), window=52*7, min_periods=1)   # for older versions of Pandas
result = df.asfreq('D').rolling(window=52*7, min_periods=1).max()
result = result.rename(columns={'price':'rolling max'})

ax = df.plot()
result.plot(ax=ax)
plt.show()

enter image description here

Upvotes: 9

Related Questions