Reputation: 107
I'm fairly new at Python, and kinda stuck. I need to add a column to a dataframe containing the rolling difference of a column's value. However, I need this count to reset in intervals of 3. Best I could do was
df['Value'].rolling(window=2).apply(lambda x: x[1] - x[0])
But I don't know how to reset the diff count. I've experimented with groupby, but with no results.
The table at hand looks like this
ID Location Year Value
01 EU 2000 42.402
02 EU 2001 44.336
03 EU 2002 46.477
04 US 2000 0.456
06 US 2001 0.438
07 US 2002 0.437
The desired results should look like this:
ID Location Year Value RD
01 EU 2000 42.402 Null
02 EU 2001 44.336 1.934
03 EU 2002 46.477 2.141
04 US 2000 0.456 Null
06 US 2001 0.438 -0.124
07 US 2002 0.437 0.001
Thanks a lot.
Upvotes: 1
Views: 488
Reputation: 863236
Use DataFrame.groupby
by integer division by 3
for general solution with arange
by length of DataFrame
:
idx = np.arange(len(df)) // 3
#if default index
#idx = df.index // 3
s = df.groupby(idx)['Value'].rolling(window=2).apply(lambda x: x[1] - x[0], 'raw=False')
df['RD'] = s.reset_index(level=0, drop=True)
print (df)
ID Location Year Value RD
0 1 EU 2000 42.402 NaN
1 2 EU 2001 44.336 1.934
2 3 EU 2002 46.477 2.141
3 4 US 2000 0.456 NaN
4 6 US 2001 0.438 -0.018
5 7 US 2002 0.437 -0.001
Upvotes: 2