code_mama

Reputation: 11

Use recurring row values or booleans to define boundaries of pandas groupby

I have a pandas DataFrame that has a boolean column to indicate whether a given row is actually a header (vs. a value). I want to be able to make pandas groupby objects out of the header row and all subsequent rows before the next header.

Imagine a DataFrame with the following column:

pd.Series([True, False, False, False, True, False, False])

I want to run a groupby statement that will separate this DataFrame into two groups: [True, False, False, False] and [True, False, False]. How can I do this?

Upvotes: 0

Views: 114

Answers (1)

cs95

Reputation: 402962

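Take the cumulative sum of the boolean column, then group by the result. Each True adds 1 to the running total, so every header row starts a new group label: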

df.groupby(df['your_col'].cumsum())

Here's an example of what that looks like:

df

   A      B
0  a   True
1  b  False
2  c  False
3  d  False
4  e   True
5  f  False
6  g  False

df.groupby(df['B'].cumsum())['B'].agg(list)

B
1    [True, False, False, False]
2           [True, False, False]
Name: B, dtype: object
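
To work with each group rather than aggregate it, you can iterate over the groupby object. The following is a minimal, self-contained sketch that rebuilds the example frame above; the names group_ids and groups are only illustrative.

```python
import pandas as pd

# Rebuild the example frame: B marks header rows.
df = pd.DataFrame({
    "A": list("abcdefg"),
    "B": [True, False, False, False, True, False, False],
})

# True counts as 1 and False as 0, so the running total
# changes only at header rows: 1, 1, 1, 1, 2, 2, 2.
group_ids = df["B"].cumsum()

# Each sub-frame holds a header row plus the rows that follow it.
groups = {int(key): sub["A"].tolist() for key, sub in df.groupby(group_ids)}
print(groups)
```

Note that any rows appearing before the first True would get the label 0 and form their own group.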

Upvotes: -1
