DominikS

Reputation: 271

Pandas: Group sums of n consecutive elements in a dataframe column

Let df be a pandas dataframe of the following form:

n   days 
1    9.0
2    4.0
3    5.0 
4    1.0 
5    4.0    
6    1.0
7    7.0
8    3.0

For a given N and every row position i >= N-1 (zero-based), I want to sum the values in df.days.iloc[i-N+1:i+1] and write the result into a new column at row i. The result should look like this (e.g., for N = 3):

n   days loc_sum
1    9.0     NaN
2    4.0     NaN
3    5.0    18.0
4    1.0    10.0
5    4.0    10.0
6    1.0     6.0
7    7.0    12.0
8    3.0    11.0

Of course, I could simply loop through all i, and insert df.days.iloc[i-N+1:i+1].sum() for every i.
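
For reference, the loop described above could be sketched roughly as follows. It assumes i is a zero-based row position and starts at N-1, so that the first full window covers rows 0 to N-1. The array name loc_sum_values is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "n": range(1, 9),
    "days": [9.0, 4.0, 5.0, 1.0, 4.0, 1.0, 7.0, 3.0],
})
N = 3

# Start with NaN everywhere; rows without a full window stay NaN
loc_sum_values = np.full(len(df), np.nan)

# i is a zero-based position; the window is rows i-N+1 .. i inclusive
for i in range(N - 1, len(df)):
    loc_sum_values[i] = df.days.iloc[i - N + 1:i + 1].sum()

df["loc_sum"] = loc_sum_values
print(df)
```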

My question is: Is there a more elegant way, using pandas functionality? Especially for large datasets, looping through the rows seems to be a very slow option.

Upvotes: 2

Views: 2243

Answers (1)

Scott Boston

Reputation: 153460

Use rolling with a window of 3, followed by sum:

df['loc_sum'] = df['days'].rolling(3).sum()

Output:

   n  days  loc_sum
0  1   9.0      NaN
1  2   4.0      NaN
2  3   5.0     18.0
3  4   1.0     10.0
4  5   4.0     10.0
5  6   1.0      6.0
6  7   7.0     12.0
7  8   3.0     11.0
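
For an arbitrary N, the same idea can be wrapped in a small helper. This is only a sketch: the function name add_rolling_sum and its parameter names are made up for illustration, and the sample frame is rebuilt from the question so the snippet runs on its own.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "n": range(1, 9),
    "days": [9.0, 4.0, 5.0, 1.0, 4.0, 1.0, 7.0, 3.0],
})

def add_rolling_sum(frame, window, source="days", target="loc_sum"):
    """Return a copy of frame with a trailing rolling sum of `source`.

    Hypothetical helper: rows with fewer than `window` preceding
    values (including the current one) are left as NaN.
    """
    out = frame.copy()
    out[target] = out[source].rolling(window=window).sum()
    return out

result = add_rolling_sum(df, window=3)
print(result)
```

By default, rolling requires a full window before it produces a value, which is why the first N-1 rows are NaN. If partial sums are wanted for those rows instead, passing min_periods=1 to rolling should give them.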

Upvotes: 2
