Matthias Foerster

Reputation: 23

Group boolean values in Pandas Dataframe

I have a DataFrame with a column containing a random sequence of True and False values:

import pandas as pd
df = pd.DataFrame(data={'A': [True, False, True, True, False, False, True, False, True, False, False]})
df
        A
0    True
1   False
2    True
3    True
4   False
5   False
6    True
7   False
8    True
9   False
10  False

and I want this (I don't know how to explain it in simple words):

        A  B
0    True  1
1   False  2
2    True  2
3    True  2
4   False  3
5   False  3
6    True  3
7   False  4
8    True  4
9   False  5
10  False  5

I've tried the following commands, but without success: df['A'].shift(), df['A'].diff(), df['A'].eq()

Many thanks for your help. Matthias

Upvotes: 2

Views: 293

Answers (2)

BENY

Reputation: 323346

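A little bit of logic with diff: cast the column to int so that a True -> False step shows up as -1, then count those steps with cumsum.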

(~df.A.astype(int).diff().ne(-1)).cumsum()+1
Out[234]: 
0     1
1     2
2     2
3     2
4     3
5     3
6     3
7     4
8     4
9     5
10    5
Name: A, dtype: int32
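
To see why the one-liner above works, it may help to split it into its intermediate steps. The following is only a sketch of the same idea; the variable names step and drops are made up for illustration, and eq(-1) is used in place of the equivalent ~...ne(-1).

```python
import pandas as pd

df = pd.DataFrame({'A': [True, False, True, True, False, False,
                         True, False, True, False, False]})

# Cast to int so that diff() yields -1 exactly where a True is followed by a False.
step = df['A'].astype(int).diff()

# Mark the True -> False transitions; the first row (NaN) compares as False.
drops = step.eq(-1)

# Each transition starts a new group; adding 1 makes the labels start at 1.
df['B'] = drops.cumsum() + 1
print(df)
```

Every row where drops is True bumps the running count by one, so a new label begins at each False that directly follows a True.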

Upvotes: 0

Nk03

Reputation: 14949

IIUC, you can try:

df['B'] = (df.A.shift() & ~df.A).cumsum() + 1
# OR df['B'] = (df.A.shift() & ~df.A).cumsum().add(1)

OUTPUT:

        A  B
0    True  1
1   False  2
2    True  2
3    True  2
4   False  3
5   False  3
6    True  3
7   False  4
8    True  4
9   False  5
10  False  5
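
One thing to be aware of: shift() on a boolean column introduces a NaN in the first row, which typically turns the shifted Series into object dtype, and how & treats that may differ between pandas versions. A possible variant, offered here as an assumption rather than as part of the answer above, passes fill_value=False so that everything stays boolean. The names prev and new_group are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [True, False, True, True, False, False,
                         True, False, True, False, False]})

# fill_value=False keeps the shifted Series as bool instead of introducing NaN.
prev = df['A'].shift(fill_value=False)

# True where the previous row was True and the current row is False.
new_group = prev & ~df['A']

# Count the transitions; adding 1 makes the labels start at 1.
df['B'] = new_group.cumsum() + 1
print(df)
```

This should give the same B column as shown in the output above.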

Upvotes: 2
