Reputation: 531
I have a dataframe with two columns that represent coordinates and an additional column in a boolean format:
X Y PROB
2 4 False
3 5 False
3 2 False
4 4 True
3 7 True
2 4 False
2 3 False
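For anyone who wants to reproduce this, the table above can be built roughly as follows. This is only a sketch: it assumes X and Y are integers and that PROB is a genuine boolean column rather than the strings "True"/"False".

```python
import pandas as pd

# Sample data copied from the table above.
# Assumption: PROB holds real booleans, not strings.
df = pd.DataFrame({
    "X": [2, 3, 3, 4, 3, 2, 2],
    "Y": [4, 5, 2, 4, 7, 4, 3],
    "PROB": [False, False, False, True, True, False, False],
})
```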
What I'm trying to do is split the rows into runs of consecutive False and consecutive True values and produce 2 new dataframes, with PROB replaced by the run number, as follows:
in the case of False
X Y PROB
2 4 1
3 5 1
3 2 1
2 4 2
2 3 2
in the case of True
X Y PROB
4 4 1
3 7 1
Right now my approach uses .isin, but I get a KeyError. Any ideas?
Upvotes: 1
Views: 202
Reputation: 323316
Or you can try this (PS: you can drop the helper column Group afterwards with .drop(columns='Group'))
df['Group']=df.PROB.astype(int).diff().fillna(0).ne(0).cumsum()
df_True=df[df.PROB]
df_False=df[~df.PROB]
df_False.assign(PROB=pd.factorize(df_False.Group)[0]+1)
Out[111]:
X Y PROB Group
0 2 4 1 0
1 3 5 1 0
2 3 2 1 0
5 2 4 2 2
6 2 3 2 2
df_True.assign(PROB=pd.factorize(df_True.Group)[0]+1)
Out[112]:
X Y PROB Group
3 4 4 1 1
4 3 7 1 1
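If it helps, the steps above can be wrapped into one self-contained function that never adds the Group column in the first place. This is a minimal sketch, not a drop-in from the answer: the name split_runs is made up for illustration, and it assumes the column is a real boolean dtype.

```python
import pandas as pd

def split_runs(df, col="PROB"):
    """Hypothetical helper: split df on a boolean column and number each run."""
    # A new run starts wherever the value differs from the previous row.
    run_id = df[col].astype(int).diff().fillna(0).ne(0).cumsum()
    parts = {}
    for value in (False, True):
        mask = df[col] == value
        part = df[mask].copy()
        # factorize renumbers the surviving run ids as 1, 2, 3, ...
        part[col] = pd.factorize(run_id[mask])[0] + 1
        parts[value] = part
    return parts

df = pd.DataFrame({
    "X": [2, 3, 3, 4, 3, 2, 2],
    "Y": [4, 5, 2, 4, 7, 4, 3],
    "PROB": [False, False, False, True, True, False, False],
})

parts = split_runs(df)
df_false, df_true = parts[False], parts[True]
```

Keeping the run ids in a separate Series means there is nothing to drop afterwards, and the original df is left untouched.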
Upvotes: 2
Reputation: 294488
d1 = df.assign(
    PROB=df.PROB.diff().fillna(False).cumsum()
).groupby(df.PROB).apply(
    lambda d: d.assign(PROB=d.PROB.factorize()[0] + 1)
)
d1
X Y PROB
PROB
False 0 2 4 1
1 3 5 1
2 3 2 1
5 2 4 2
6 2 3 2
True 3 4 4 1
4 3 7 1
d1.xs(True)
X Y PROB
3 4 4 1
4 3 7 1
d1.xs(False)
X Y PROB
0 2 4 1
1 3 5 1
2 3 2 1
5 2 4 2
6 2 3 2
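A variation on the same idea, in case diff() on a boolean column misbehaves on your pandas version: compare each value with its shifted neighbour to find run starts, then dense-rank the run ids within each PROB value. This is a sketch that swaps shift() and groupby(...).rank() in for the diff/factorize combination used above; the variable names are my own.

```python
import pandas as pd

df = pd.DataFrame({
    "X": [2, 3, 3, 4, 3, 2, 2],
    "Y": [4, 5, 2, 4, 7, 4, 3],
    "PROB": [False, False, False, True, True, False, False],
})

# True on the first row of every run (the first row always starts a run).
starts = df["PROB"].ne(df["PROB"].shift())
run_id = starts.cumsum()  # 1, 1, 1, 2, 2, 3, 3 for the sample data

# Renumber the runs separately for False and for True: 1, 2, ... per value.
numbered = run_id.groupby(df["PROB"]).rank(method="dense").astype(int)

# assign aligns on the index, so each subset picks up only its own rows.
df_false = df[~df["PROB"]].assign(PROB=numbered)
df_true = df[df["PROB"]].assign(PROB=numbered)
```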
Upvotes: 1