Reputation: 23
I have a DataFrame with a column containing a random series of True and False values:
import pandas as pd
df = pd.DataFrame(data={'A':[True, False, True, True, False, False, True, False, True, False, False]})
df
  | A |
---|---|
0 | True |
1 | False |
2 | True |
3 | True |
4 | False |
5 | False |
6 | True |
7 | False |
8 | True |
9 | False |
10 | False |
And I want this (I don't know how to explain it in simple words):
  | A | B |
---|---|---|
0 | True | 1 |
1 | False | 2 |
2 | True | 2 |
3 | True | 2 |
4 | False | 3 |
5 | False | 3 |
6 | True | 3 |
7 | False | 4 |
8 | True | 4 |
9 | False | 5 |
10 | False | 5 |
I've tried something with the following commands, but without success:
df['A'].shift()
df['A'].diff()
df['A'].eq()
Many thanks for your help. Matthias
Upvotes: 2
Views: 293
Reputation: 323346
A little bit of logic with diff:
(~df.A.astype(int).diff().ne(-1)).cumsum()+1
Out[234]:
0 1
1 2
2 2
3 2
4 3
5 3
6 3
7 4
8 4
9 5
10 5
Name: A, dtype: int32
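To spell out why this works, here is a step-by-step sketch of the same idea. It assumes the counter should go up exactly on the rows where A changes from True to False, which is what the expected output in the question suggests. The intermediate names steps and drops are only illustrative, and `.eq(-1)` is used as an equivalent of `~ ... .ne(-1)`.

```python
import pandas as pd

df = pd.DataFrame({'A': [True, False, True, True, False, False,
                         True, False, True, False, False]})

# Cast to int so diff() is plain arithmetic: a True -> False step is 0 - 1 = -1.
steps = df['A'].astype(int).diff()

# True only on rows where the previous value was True and the current one is False.
# The NaN in the first row compares unequal to -1, so it counts as False.
drops = steps.eq(-1)

# Running count of those drops, offset so the numbering starts at 1.
df['B'] = drops.cumsum() + 1
```

Rows where A goes from False to True give a diff of +1 and unchanged rows give 0, so neither moves the counter.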
Upvotes: 0
Reputation: 14949
IIUC, you can try:
df['B'] = (df.A.shift() & ~df.A).cumsum() + 1
# OR df['B'] = (df.A.shift() & ~df.A).cumsum().add(1)
OUTPUT:
A B
0 True 1
1 False 2
2 True 2
3 True 2
4 False 3
5 False 3
6 True 3
7 False 4
8 True 4
9 False 5
10 False 5
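The expression marks each row where the previous value is True and the current value is False, then counts those rows cumulatively. One thing to be aware of: a plain shift() puts NaN in the first row, which turns the shifted column into object dtype. A slightly more defensive sketch, assuming a pandas version where shift accepts fill_value, keeps everything boolean (prev and falling are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({'A': [True, False, True, True, False, False,
                         True, False, True, False, False]})

# Previous row's value; fill the first row with False so the dtype stays bool.
prev = df['A'].shift(fill_value=False)

# True where the previous row was True and the current row is False.
falling = prev & ~df['A']

# Count the True -> False transitions seen so far, starting the numbering at 1.
df['B'] = falling.cumsum() + 1
```

Filling with False means the first row can never count as a transition, so B always starts at 1.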
Upvotes: 2