Reputation: 587
I'm using pandas in Python and I'm having trouble selecting some data. I have a DataFrame with float values, and I would like to create a column that contains the maximum (or minimum) of the n previous rows of a column, with the first n rows set to 0. Here's an example of the result I would like to have:
import pandas as pd
df_test = pd.DataFrame({'a': [2, 7, 2, 0, -1, 19, -52, 2]})
df_test['result_i_want_with_n=3'] = [0, 0, 0, 7, 7, 2, 19, 19]
print(df_test)
    a  result_i_want_with_n=3
0   2                       0
1   7                       0
2   2                       0
3   0                       7
4  -1                       7
5  19                       2
6 -52                      19
7   2                      19
I managed to get this result using a while loop, but I would like to write it in a more idiomatic pandas way to speed up the computation.
Thanks
Upvotes: 8
Views: 7784
Reputation: 2110
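Rolling is your friend here. A rolling window of 3 includes the current row, so its first value appears in the third row. To get your exact result, shift by one row so that each row only sees the 3 rows before it, then fill the leading NaN values with 0.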
df_test['a'].rolling(window=3).max().shift(1).fillna(0)
0     0.0
1     0.0
2     0.0
3     7.0
4     7.0
5     2.0
6    19.0
7    19.0
Name: a, dtype: float64
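For an arbitrary n, and for the minimum as well as the maximum, the same rolling-then-shift idea can be wrapped in a small helper. This is only a sketch: the function name previous_window_extreme, its how parameter, and the column names prev_max_3 and prev_min_3 are made up for illustration and are not part of pandas.

```python
import pandas as pd


def previous_window_extreme(series, n, how="max"):
    """Max (or min) of the n rows strictly before each row.

    Hypothetical helper: rows with fewer than n previous rows get 0.
    """
    if how not in ("max", "min"):
        raise ValueError("how must be 'max' or 'min'")
    rolled = series.rolling(window=n)
    extreme = rolled.max() if how == "max" else rolled.min()
    # shift(1) drops the current row from each window; the first n rows
    # end up as NaN and are replaced with 0, as asked in the question.
    return extreme.shift(1).fillna(0)


df_test = pd.DataFrame({"a": [2, 7, 2, 0, -1, 19, -52, 2]})
df_test["prev_max_3"] = previous_window_extreme(df_test["a"], n=3)
df_test["prev_min_3"] = previous_window_extreme(df_test["a"], n=3, how="min")
print(df_test)
```

Note that the new columns are float64 because the intermediate result contains NaN; append .astype(int) if you need integers.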
Upvotes: 11