Reputation: 301
d = pd.DataFrame({'a':[7,6,3,4,8], 'b':['c','c','d','d','c']})
d.groupby('b')['a'].diff()
Gives me
0 NaN
1 -1.0
2 NaN
3 1.0
4 2.0
What I'd need
0 NaN
1 -1.0
2 NaN
3 1.0
4 NaN
Which is difference between only successive values within group, so when a group appears after another group , it's previous values are ignored.
In my example last c
value is a new c
group.
Upvotes: 0
Views: 105
Reputation: 76967
You would need to groupby
on consecutive segments
In [1055]: d.groupby((d.b != d.b.shift()).cumsum())['a'].diff()
Out[1055]:
0 NaN
1 -1.0
2 NaN
3 1.0
4 NaN
Name: a, dtype: float64
Details
In [1056]: (d.b != d.b.shift()).cumsum()
Out[1056]:
0 1
1 1
2 2
3 2
4 3
Name: b, dtype: int32
Upvotes: 2