Reputation: 1
Let's say I have a pandas Series that looks like this:
0 1
1 1
2 1
3 2
4 2
5 3
6 3
7 3
8 2
9 2
10 2
11 2
12 1
A value of 1 means that this row is its own group. The first time you see a value of 2, it means that this row and the next are in a group. The first time you see a value of 3, it means that this row and the following two rows are in a group. And so on and so forth.
So for the above example I would expect it to be grouped as such:
0 1
------
1 1
------
2 1
------
3 2
4 2
------
5 3
6 3
7 3
------
8 2
9 2
------
10 2
11 2
------
12 1
I know this can be done with iterrows and keeping count of the rows, but I am curious whether there is a better way to do this in pandas. Thanks!
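For reference, the row-counting approach mentioned above might look roughly like the sketch below. The function name `loop_group_ids` and the variable names are made up for illustration, and it assumes the input is well formed (a value of n always appears on n consecutive rows of its group).

```python
import pandas as pd

# The example Series from the question
s = pd.Series([1, 1, 1, 2, 2, 3, 3, 3, 2, 2, 2, 2, 1], name="s")


def loop_group_ids(values):
    """Assign a group number to each row by counting rows explicitly."""
    ids = []
    group = -1
    remaining = 0  # rows still belonging to the current group
    for v in values:
        if remaining == 0:
            # Previous group is complete: start a new one of size v
            group += 1
            remaining = v
        ids.append(group)
        remaining -= 1
    return ids


loop_ids = pd.Series(loop_group_ids(s), index=s.index)
print(loop_ids.tolist())
```

This works, but it walks the rows one at a time in Python, which is what I would like to avoid.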
Upvotes: 0
Views: 37
Reputation: 323306
Use diff with cumsum to create the groupby key:
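# Label each run of consecutive equal values
key = s.diff().ne(0).cumsum()
# Split each run into chunks of s rows, then number the chunks in order of appearance
key = s.groupby([s.groupby(key).cumcount() // s, key], sort=False).ngroup()
# Collect each group into a dict keyed by its group number
d = {x: y for x, y in s.groupby(key)}
d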
{0: 0 1
Name: s, dtype: int64, 1: 1 1
Name: s, dtype: int64, 2: 2 1
Name: s, dtype: int64, 3: 3 2
4 2
Name: s, dtype: int64, 4: 5 3
6 3
7 3
Name: s, dtype: int64, 5: 8 2
9 2
Name: s, dtype: int64, 6: 10 2
11 2
Name: s, dtype: int64, 7: 12 1
Name: s, dtype: int64}
# The key looks like this:
key
0 0
1 1
2 2
3 3
4 3
5 4
6 4
7 4
8 5
9 5
10 6
11 6
12 7
dtype: int64
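To see why this works, here is a self-contained sketch that rebuilds the example Series and keeps the intermediate steps separate. The names `run` and `chunk` are only for illustration and do not appear in the snippet above; the sketch also assumes a default integer index and that each value n covers a whole number of groups of n rows.

```python
import pandas as pd

# The example Series from the question
s = pd.Series([1, 1, 1, 2, 2, 3, 3, 3, 2, 2, 2, 2, 1], name="s")

# Step 1: label each run of consecutive equal values.
# diff() is NaN on the first row and non-zero wherever the value changes,
# so ne(0) marks the start of every run and cumsum() numbers the runs.
run = s.diff().ne(0).cumsum()

# Step 2: position of each row inside its run, integer-divided by the
# group size. A run of four 2s gives positions 0, 1, 2, 3 and chunks 0, 0, 1, 1.
chunk = s.groupby(run).cumcount() // s

# Step 3: every (chunk, run) pair is one group. sort=False keeps the groups
# in order of first appearance, so ngroup() numbers them top to bottom.
key = s.groupby([chunk, run], sort=False).ngroup()

print(pd.DataFrame({"s": s, "run": run, "chunk": chunk, "key": key}))
```

The run label alone would put rows 8 to 11 into a single group of four; the chunk number is what splits that run into two groups of two.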
Upvotes: 1