Reputation: 271
Let df be a pandas dataframe of the following form:
n days
1 9.0
2 4.0
3 5.0
4 1.0
5 4.0
6 1.0
7 7.0
8 3.0
For a given N and every row position i >= N-1 (counting from 0, as iloc does), I want to sum the values in df.days.iloc[i-N+1:i+1] and write the result into a new column, in row i.
The result should look like this (e.g., for N = 3):
n days loc_sum
1 9.0 NaN
2 4.0 NaN
3 5.0 18.0
4 1.0 10.0
5 4.0 10.0
6 1.0 6.0
7 7.0 12.0
8 3.0 11.0
Of course, I could simply loop through all i and insert df.days.iloc[i-N+1:i+1].sum() for every i.
My question is: is there a more elegant way, using pandas functionality? Especially for large datasets, looping through the rows seems to be a very slow option.
Upvotes: 2
Views: 2243
Reputation: 153460
Use rolling with a window equal to 3 (your N) and the function sum:
df['loc_sum'] = df['days'].rolling(3).sum()
Output:
   n  days  loc_sum
0  1   9.0      NaN
1  2   4.0      NaN
2  3   5.0     18.0
3  4   1.0     10.0
4  5   4.0     10.0
5  6   1.0      6.0
6  7   7.0     12.0
7  8   3.0     11.0
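For an arbitrary N, the same call can be written with the window as a variable. The following is a minimal, self-contained sketch that rebuilds the sample data from the question and compares the rolling result against the explicit loop described there; the names N and loop_sum are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "n": range(1, 9),
    "days": [9.0, 4.0, 5.0, 1.0, 4.0, 1.0, 7.0, 3.0],
})

N = 3  # window length (illustrative name)

# Sum of the current row and the N-1 rows before it;
# rows without a full window get NaN by default
df["loc_sum"] = df["days"].rolling(window=N).sum()

# Reference: the explicit loop from the question, for comparison only
loop_sum = pd.Series(
    [
        df["days"].iloc[i - N + 1:i + 1].sum() if i >= N - 1 else float("nan")
        for i in range(len(df))
    ],
    index=df.index,
)

print(df)
```

If partial sums are wanted for the first N-1 rows instead of NaN, rolling also accepts min_periods, e.g. df['days'].rolling(window=N, min_periods=1).sum().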
Upvotes: 2