Python Dataframe Find n rows rolling slope without for loop

Question

I am trying to take n rows of a DataFrame at a time and compute the slope of a linear regression over them (a rolling slope). The objective is not to use a for loop, because my df has 30k rows and a loop may slow it down. So I am looking for a pandas function that computes the slope over every window of n rows.

My code:

import numpy as np
import pandas as pd
from scipy import stats

dfx = pd.DataFrame({'A': [10, 20, 15, 30, 1.5, 0.6, 7, 0.8, 90, 10]})
n = 2  # window size: number of samples per regression
cl_id = dfx.columns.tolist().index('A')  # positional index of column 'A', for use with .iloc
l1 = ['NaN']*n + [stats.linregress(dfx.iloc[x+1-n:x+1, cl_id].tolist(), [1, 2])[0] for x in np.arange(n, len(dfx))]
dfx['slope'] = l1
print(dfx)
      A      slope
0  10.0        NaN
1  20.0        NaN  #stats.linregress([10,20],[1,2])[0] is missing here. Why?
2  15.0       -0.2  #stats.linregress([20,15],[1,2])[0] = -0.2
3  30.0  0.0666667  #stats.linregress([15,30],[1,2])[0] = 0.06667
4   1.5 -0.0350877
5   0.6   -1.11111
6   7.0    0.15625
7   0.8   -0.16129
8  90.0  0.0112108
9  10.0    -0.0125

This works, but is there a more Pythonic way of doing it, such as using the rolling() function?

Mohsin hasan · Accepted Answer

n = 2
dfx.A.rolling(n).apply(lambda x: stats.linregress(x, x.index+1)[0], raw=False)

Output:

0         NaN
1    0.100000
2   -0.200000
3    0.066667
4   -0.035088
5   -1.111111
6    0.156250
7   -0.161290
8    0.011211
9   -0.012500
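
Row 1 is filled in here because rolling(n) emits its first value at position n-1, whereas the loop in the question starts at x = n and pads n placeholders. For 30k rows, calling linregress once per window can still be slow, so below is a minimal sketch of a closed-form alternative. It is not part of the accepted answer: it assumes the same argument order as the question (values as the regressor, row positions as the response), in which case the slope is cov(values, positions) / var(values) over each window. The names pos, roll, slope_apply and slope_fast are made up for this illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats

dfx = pd.DataFrame({'A': [10, 20, 15, 30, 1.5, 0.6, 7, 0.8, 90, 10]})
n = 2  # window size

# Accepted-answer approach: one linregress call per window.
# The column values are the regressor and the row labels the response,
# matching the argument order used in the question.
slope_apply = dfx['A'].rolling(n).apply(
    lambda w: stats.linregress(w, w.index + 1)[0], raw=False
)

# Closed-form sketch (an assumption of this example, not from the answer):
# slope = cov(values, positions) / var(values), both over the same window.
# Both statistics use ddof=1 by default, so the normalisation cancels.
pos = pd.Series(np.arange(len(dfx), dtype=float), index=dfx.index)
roll = dfx['A'].rolling(n)
slope_fast = roll.cov(pos) / roll.var()

dfx['slope'] = slope_fast
print(dfx)
```

Both versions should agree up to floating-point error. Note that a window whose values are all identical has zero variance, so the slope is undefined there; it is worth checking how such rows come out on your data before relying on either version.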
