roflolmaomg1

Reputation: 1

Pandas - Uniquely group the next n rows where n is defined by the column value

Let's say I have a pandas Series that looks like this:

0   1
1   1
2   1
3   2
4   2
5   3
6   3
7   3
8   2
9   2
10  2
11  2
12  1

A value of 1 means that this row is its own group. The first time you see a value of 2, it means that this row and the next are in a group. The first time you see a value of 3, it means that this row and the following two rows are in a group. And so on and so forth.

So for the above example I would expect it to be grouped as such:

0   1
------
1   1
------
2   1
------
3   2
4   2
------
5   3
6   3
7   3
------
8   2
9   2
------
10  2
11  2
------
12  1

I know this can be done with iterrows and keeping count of the rows, but I am curious whether there is a better way to do this in pandas. Thanks!
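For reference, the row-counting approach could look roughly like the sketch below. The function name `group_ids_by_counting` is made up for illustration, and it assumes every value is a positive integer and that the data is well-formed (each group's first row states the group size).

```python
import pandas as pd


def group_ids_by_counting(s: pd.Series) -> pd.Series:
    """Walk the values in order and give each row a group id (hypothetical helper)."""
    values = s.to_numpy()
    ids = []
    group_id = 0
    i = 0
    while i < len(values):
        size = int(values[i])
        if size < 1:
            # Assumption: a group size below 1 is invalid input.
            raise ValueError(f"group size must be >= 1, got {size} at position {i}")
        # Do not run past the end if the last group is cut short.
        size = min(size, len(values) - i)
        ids.extend([group_id] * size)
        i += size
        group_id += 1
    return pd.Series(ids, index=s.index)


s = pd.Series([1, 1, 1, 2, 2, 3, 3, 3, 2, 2, 2, 2, 1], name="s")
print(group_ids_by_counting(s).tolist())
```

The result can be passed straight to `s.groupby(...)` as the grouping key.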

Upvotes: 0

Views: 37

Answers (1)

BENY

Reputation: 323306

Use diff with cumsum to create the groupby key:

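# Label each run of consecutive equal values (s is the Series from the question)
key = s.diff().ne(0).cumsum()

# Split each run into chunks of length s, then number the chunks in order of appearance
key = s.groupby([s.groupby(key).cumcount() // s, key], sort=False).ngroup()

# Map each group number to its rows
d = {x: y for x, y in s.groupby(key)}
d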
{0: 0    1
Name: s, dtype: int64, 1: 1    1
Name: s, dtype: int64, 2: 2    1
Name: s, dtype: int64, 3: 3    2
4    2
Name: s, dtype: int64, 4: 5    3
6    3
7    3
Name: s, dtype: int64, 5: 8    2
9    2
Name: s, dtype: int64, 6: 10    2
11    2
Name: s, dtype: int64, 7: 12    1
Name: s, dtype: int64}

# The key looks like this:

key
0     0
1     1
2     2
3     3
4     3
5     4
6     4
7     4
8     5
9     5
10    6
11    6
12    7
dtype: int64
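To make the steps easier to follow, here is a self-contained sketch that rebuilds the sample Series and keeps each intermediate result. How `s` is constructed and the names `runs`, `pos` and `chunk` are assumptions added for illustration; the logic is the same as above.

```python
import pandas as pd

# Sample data from the question (assumed to be an integer Series named "s").
s = pd.Series([1, 1, 1, 2, 2, 3, 3, 3, 2, 2, 2, 2, 1], name="s")

# Step 1: label each run of consecutive equal values.
# diff() is NaN on the first row, and NaN != 0 is True, so the first run starts at 1.
runs = s.diff().ne(0).cumsum()

# Step 2: position of each row inside its run (0, 1, 2, ...).
pos = s.groupby(runs).cumcount()

# Step 3: integer-divide the position by the value, so a run of 2s
# is cut into chunks of two rows, a run of 3s into chunks of three, etc.
chunk = pos // s

# Step 4: number each (chunk, run) pair in order of first appearance.
key = s.groupby([chunk, runs], sort=False).ngroup()

print(pd.DataFrame({"s": s, "runs": runs, "pos": pos, "chunk": chunk, "key": key}))
```

Note that this treats each run of equal values separately, so it relies on the data being well-formed: if a run's length is not a multiple of its value, the last chunk of that run will simply be shorter.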

Upvotes: 1
