Reputation: 887
I want to exclude rows, per ID, when a specific binary variable ("Y") takes the same value in every row for that ID. In other words, if an ID has only 0s or only 1s in Y, all of its rows should be dropped.
Data illustration:
ID X Y
a .. 0
a .. 0
a .. 0
b .. 1
b .. 0
b .. 1
b .. 0
c .. 1
c .. 1
c .. 1
c .. 1
Expected result:
ID X Y
b .. 1
b .. 0
b .. 1
b .. 0
Upvotes: 2
Views: 129
Reputation: 323266
Since you mentioned filter, you can use groupby().filter and keep only the groups with more than one distinct value in Y:
df.groupby('ID').filter(lambda x: x['Y'].nunique() > 1)
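To make the one-liner above easier to try out, here is a minimal self-contained sketch. The DataFrame is rebuilt from the question's illustration; the ".." values in X are placeholders, and the names df and result are only assumptions for this example.

```python
import pandas as pd

# Sample data mirroring the question; the X values are placeholders.
df = pd.DataFrame({
    "ID": list("aaabbbbcccc"),
    "X": [".."] * 11,
    "Y": [0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1],
})

# The lambda is called once per ID group and must return a single boolean:
# True keeps every row of the group, False drops the whole group.
result = df.groupby("ID").filter(lambda g: g["Y"].nunique() > 1)
print(result)
```

Only the rows for ID b should remain, with their original index labels. Note that nunique() ignores NaN by default, so an ID whose Y values are, say, only 1 and NaN would still be dropped.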
Upvotes: 4
Reputation: 75080
Use groupby() on ID and transform with nunique, then filter rows with results greater than 1:
df[df.groupby('ID')['Y'].transform('nunique')>1]
  ID   X  Y
3  b  ..  1
4  b  ..  0
5  b  ..  1
6  b  ..  0
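The transform approach can be broken into steps to show what the boolean mask looks like. This is just a sketch on the question's sample data; the variable names distinct_per_row, mask, and kept are made up for illustration.

```python
import pandas as pd

# Sample data mirroring the question; the X values are placeholders.
df = pd.DataFrame({
    "ID": list("aaabbbbcccc"),
    "X": [".."] * 11,
    "Y": [0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1],
})

# transform broadcasts each group's distinct-value count back to its rows,
# so the result is aligned with df's index and has one entry per row.
distinct_per_row = df.groupby("ID")["Y"].transform("nunique")

# Rows whose ID group contains both 0 and 1 get True.
mask = distinct_per_row > 1

kept = df[mask]
print(kept)
```

Because the mask is built from a built-in aggregation name rather than a Python lambda, this version tends to be faster than groupby().filter on frames with many groups, though that is worth measuring on your own data.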
Upvotes: 6