Reputation: 889
I have this code:
import pandas as pd
d = {'col1': [1, 2, 3, 4, 5, 6], 'col2': [7, 8, 9, 10, 11, 12], 'is_new_group': [True, False, False, True, False, False] }
df = pd.DataFrame(d)
I would like to divide the data into groups. Every group starts at a row where `is_new_group` is `True` and ends just before the next row where `is_new_group` is `True`.
In this case, the data should be divided into 2 groups: the first 3 rows and the last 3 rows.
I found `DataFrame.groupby`. According to the documentation, its `by` parameter accepts a mapping, function, label, or list of labels.
But my situation seems different.
How can I group the rows according to these requirements?
Upvotes: 2
Views: 812
Reputation: 10010
Use `cumsum` to detect every new `True`. In Python, `True` and `False` are treated as 1 and 0 respectively in arithmetic, so the cumulative sum increases by one at each row that starts a new group. The following would do what you want:
df.groupby(df.is_new_group.cumsum())
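To make the idea concrete, here is a minimal sketch using the data from the question. The names `group_id` and `groups` are just illustrative, and collecting `col1` per group is only one way to inspect the result.

```python
import pandas as pd

d = {
    'col1': [1, 2, 3, 4, 5, 6],
    'col2': [7, 8, 9, 10, 11, 12],
    'is_new_group': [True, False, False, True, False, False],
}
df = pd.DataFrame(d)

# Running count of True values: it increases by one at every row that starts a group.
group_id = df['is_new_group'].cumsum()
print(group_id.tolist())  # [1, 1, 1, 2, 2, 2]

# Iterate over the groups; each sub_df holds the consecutive rows of one group.
groups = {int(key): sub_df['col1'].tolist() for key, sub_df in df.groupby(group_id)}
print(groups)  # {1: [1, 2, 3], 2: [4, 5, 6]}
```

Note that if the first row were `False`, the rows before the first `True` would end up in a group labelled 0.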
Upvotes: 3
Reputation: 16172
If `True` is the indicator of a new group, you can check where the column equals `True` and use `cumsum` to create your group labels, which can then be used to group on.
df.groupby(df.is_new_group.eq(True).cumsum())
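One case where the explicit comparison may help is a flag column that is not strictly boolean. The sketch below uses a hypothetical variant of the question's data with missing flags, and assumes a missing flag should not start a new group; `labels` and `sums` are illustrative names.

```python
import pandas as pd

# Hypothetical variant of the question's data: the flag column has missing values,
# so its dtype is object rather than bool.
df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 5, 6],
    'is_new_group': [True, None, False, True, None, False],
})

# eq(True) gives a strict boolean mask; missing entries compare as False.
labels = df['is_new_group'].eq(True).cumsum()
print(labels.tolist())  # [1, 1, 1, 2, 2, 2]

# The labels can be passed straight to groupby, for example to aggregate per group.
sums = df.groupby(labels)['col1'].sum()
print(sums.tolist())  # [6, 15]
```

For a column that is already of boolean dtype, as in the question, `eq(True)` leaves the values unchanged, so both answers give the same groups.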
Upvotes: 3