Reputation: 213
Given a dataframe as follows:
col1 col2
a 0 True
b 0 True
c 1 True
d 1 False
e 2 False
f 2 False
g 3 True
For each unique value in col1, I would like to check if all values in col2 are matching, otherwise drop all rows for that corresponding value such that this would yield:
col1 col2
a 0 True
b 0 True
e 2 False
f 2 False
g 3 True
Upvotes: 1
Views: 1892
Reputation: 326
What have you tried? Seems like a reasonably straightforward problem. I'd use shape and drop_duplicates():
data=[
col1 col2
a 0 True
b 0 True
c 1 True
d 1 False
e 2 False
f 2 False
g 3 True
]
cols=["col1","col2"]
df=pd.DataFrame(data, columns=cols)
for ind, row in df.drop_duplicates(subset=["col1"])["col1"].iteritems():
df1=df[df["col1"]==row]
if df1.shape[0] == df1.drop_duplicates().shape[0]:
#logic goes here
If the shapes match after dropping duplicates, they're different values. If not, then drop the whole subset and don't build the new dataframe with those rows.
Upvotes: -1
Reputation: 150785
You want nunique
:
df[df.groupby('col1')['col2'].transform('nunique').eq(1)]
Output:
col1 col2
a 0 True
b 0 True
e 2 False
f 2 False
g 3 True
Upvotes: 2
Reputation: 31001
One of possible solutions:
Group by col1
and filter each group checking whether all col2
values
are True or all of them are False:
df.groupby('col1').filter(lambda x: x.col2.all() | (~x.col2).all())
Upvotes: 0