Kenny Smith

Reputation: 889

pandas group dataframe until a specific value

I have this code:

import pandas as pd
d = {'col1': [1, 2, 3, 4, 5, 6], 'col2': [7, 8, 9, 10, 11, 12], 'is_new_group': [True, False, False, True, False, False]}
df = pd.DataFrame(d)


I would like to divide the data into groups. Each group starts at a row where is_new_group is True and ends just before the next row where is_new_group is True.

In this case, it should divide the data into 2 groups: the first 3 rows and the last 3 rows.

I found `DataFrame.groupby`.

According to the documentation, the by parameter accepts a mapping, function, label, or list of labels.

But my situation is different: the groups are defined by row position relative to a flag, not by a shared value.

How can I group the rows according to these requirements?

Upvotes: 2

Views: 812

Answers (2)

Gwang-Jin Kim

Reputation: 10010

Use cumsum to detect every new True. In Python, True and False get converted to 1 and 0 respectively when performing calculations on them.

df.groupby(df.is_new_group.cumsum())

would do what you want.
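To make the intermediate step visible, here is a minimal sketch using the data from the question. The names `group_ids` and `groups` are introduced only for illustration; the running sum of the boolean column gives every row a label that increases by one at each True.

```python
import pandas as pd

d = {
    'col1': [1, 2, 3, 4, 5, 6],
    'col2': [7, 8, 9, 10, 11, 12],
    'is_new_group': [True, False, False, True, False, False],
}
df = pd.DataFrame(d)

# Running sum of the boolean flags: True counts as 1, False as 0,
# so the label increases by one at every row that starts a new group.
group_ids = df['is_new_group'].cumsum()

# Split the frame into one sub-DataFrame per label (illustrative helper list).
groups = [group for _, group in df.groupby(group_ids)]

for label, group in df.groupby(group_ids):
    print(label)
    print(group)
```

Note that if the first row were False, the rows before the first True would end up together under label 0.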

Upvotes: 3

Chris

Reputation: 16172

If True is the indicator of a new group, you can check where the column equals True and use cumsum to create your group labels. You can then group on those labels.

df.groupby(df.is_new_group.eq(True).cumsum())
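As a rough illustration of using these labels, the sketch below builds them on the question's data and aggregates per group. The names `labels` and `col2_sums` are assumptions made for the example. For a column that is already boolean, the `.eq(True)` step produces the same labels as calling cumsum on the column directly.

```python
import pandas as pd

d = {
    'col1': [1, 2, 3, 4, 5, 6],
    'col2': [7, 8, 9, 10, 11, 12],
    'is_new_group': [True, False, False, True, False, False],
}
df = pd.DataFrame(d)

# Explicit comparison against True, then a running sum to build labels.
labels = df['is_new_group'].eq(True).cumsum()

# On a boolean column this matches the plain cumsum of the column.
same_as_plain = labels.equals(df['is_new_group'].cumsum())

# Example aggregation: sum of col2 within each group.
col2_sums = df.groupby(labels)['col2'].sum()
print(col2_sums)
```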

Upvotes: 3
