Jonathan Pacheco

Reputation: 531

Split a pandas DataFrame into consecutive runs of a boolean column

I have a dataframe with two columns that represent coordinates and an additional column in a boolean format:

X    Y    PROB
2    4    False
3    5    False
3    2    False
4    4    True
3    7    True
2    4    False
2    3    False

What I'm trying to do is split the rows into consecutive runs of False and True, and produce two new dataframes in which PROB is replaced by the number of the run each row belongs to, as follows:

in the case of False

X   Y  PROB
2   4   1
3   5   1
3   2   1
2   4   2  
2   3   2

in the case of True

X   Y  PROB
4   4   1
3   7   1

Right now my approach uses .isin, but I get a KeyError. Any ideas?

Upvotes: 1

Views: 202

Answers (2)

BENY

Reputation: 323316

Or you can try this (PS: drop the helper column Group afterwards with .drop(columns='Group'); the older positional form .drop('Group', 1) was removed in pandas 2.0):

df['Group']=df.PROB.astype(int).diff().fillna(0).ne(0).cumsum()
df_True=df[df.PROB]
df_False=df[~df.PROB]
df_False.assign(PROB=pd.factorize(df_False.Group)[0]+1)
Out[111]: 
   X  Y  PROB  Group
0  2  4     1      0
1  3  5     1      0
2  3  2     1      0
5  2  4     2      2
6  2  3     2      2

df_True.assign(PROB=pd.factorize(df_True.Group)[0]+1)
Out[112]: 
   X  Y  PROB  Group
3  4  4     1      1
4  3  7     1      1
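
For anyone who wants to paste and run this, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt from the question. The helper name number_runs and the use of ne(shift()) instead of astype(int).diff() are my own choices for illustration, not part of the answer above. The run ids are kept in a separate Series, so there is no Group column to drop afterwards.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "X": [2, 3, 3, 4, 3, 2, 2],
    "Y": [4, 5, 2, 4, 7, 4, 3],
    "PROB": [False, False, False, True, True, False, False],
})

# A new run starts wherever PROB differs from the previous row;
# the cumulative sum of those change points gives one id per run.
group = df["PROB"].ne(df["PROB"].shift()).cumsum()


def number_runs(mask):
    """Hypothetical helper: keep the rows selected by mask and number their runs 1, 2, ..."""
    part = df[mask]
    # factorize relabels the surviving run ids as 0, 1, 2, ... in order of appearance.
    run_no = pd.factorize(group[mask])[0] + 1
    return part.assign(PROB=run_no)


df_false = number_runs(~df["PROB"])
df_true = number_runs(df["PROB"])

print(df_false)
print(df_true)
```

The original row labels (0, 1, 2, 5, 6 and 3, 4) are preserved; call .reset_index(drop=True) on each result if you want them renumbered.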

Upvotes: 2

piRSquared

Reputation: 294488

d1 = df.assign(
    PROB=df.PROB.diff().fillna(False).cumsum()
).groupby(df.PROB).apply(
    lambda d: d.assign(PROB=d.PROB.factorize()[0] + 1)
)

d1

         X  Y  PROB
PROB               
False 0  2  4     1
      1  3  5     1
      2  3  2     1
      5  2  4     2
      6  2  3     2
True  3  4  4     1
      4  3  7     1

d1.xs(True)

   X  Y  PROB
3  4  4     1
4  3  7     1

d1.xs(False)

   X  Y  PROB
0  2  4     1
1  3  5     1
2  3  2     1
5  2  4     2
6  2  3     2
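
A rough sketch of a variant of this single-pass approach, in case the groupby/apply form is hard to follow. Note that it swaps in a dense rank for the factorize step: because run ids only ever increase, the dense rank of a run id within its PROB group is the run's number (1, 2, ...). The names run_id, run_no and parts are assumptions for illustration only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "X": [2, 3, 3, 4, 3, 2, 2],
    "Y": [4, 5, 2, 4, 7, 4, 3],
    "PROB": [False, False, False, True, True, False, False],
})

# One id per consecutive run of equal PROB values.
run_id = df["PROB"].ne(df["PROB"].shift()).cumsum()

# Number the runs separately among the False rows and among the True rows.
# rank returns floats, so cast back to int.
run_no = run_id.groupby(df["PROB"]).rank(method="dense").astype(int)

# Split into one frame per PROB value, replacing PROB with the run number.
parts = {
    bool(flag): part.assign(PROB=run_no.loc[part.index])
    for flag, part in df.groupby("PROB")
}

print(parts[False])
print(parts[True])
```

parts[False] and parts[True] play the same role as d1.xs(False) and d1.xs(True) above.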

Upvotes: 1
