Reputation: 967
I have a DataFrame df that looks like this:
Data1 Data2 Data3
A XX AA
A YY AA
B XX BB
B YY CC
C XX DD
C YY DD
D XX EE
D YY FF
I want to delete rows based on Data3: for each value of Data1 (across its Data2 rows), if all the values in Data3 are the same, every row for that Data1 value should be deleted.
My expected result looks like this:
Data1 Data2 Data3
B XX BB
B YY CC
D XX EE
D YY FF
How can I do this?
Upvotes: 0
Views: 72
Reputation: 71560
You can also use a groupby with nunique, and a selection of rows:
>>> group = df.groupby('Data1')['Data3'].nunique()
>>> df[df['Data1'].isin(group[group.gt(1)].index)]
Data1 Data2 Data3
2 B XX BB
3 B YY CC
6 D XX EE
7 D YY FF
>>>
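For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's frame and repeats the two steps above. The DataFrame construction and the name `keep_labels` are assumptions added for illustration; the question only shows the printed table.

```python
import pandas as pd

# Rebuild the sample data from the question (construction is assumed).
df = pd.DataFrame({
    "Data1": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Data2": ["XX", "YY"] * 4,
    "Data3": ["AA", "AA", "BB", "CC", "DD", "DD", "EE", "FF"],
})

# Number of distinct Data3 values within each Data1 group.
group = df.groupby("Data1")["Data3"].nunique()

# Data1 labels whose Data3 values are not all identical.
keep_labels = group[group.gt(1)].index

# Keep only the rows belonging to those labels.
result = df[df["Data1"].isin(keep_labels)]
print(result)
```

Here `group` is indexed by the Data1 labels rather than by the original rows, which is why the `isin` lookup is needed to map the surviving labels back onto `df`.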
Upvotes: 1
Reputation: 323226
Using groupby + transform with nunique:
yd = df[df.groupby('Data1')['Data3'].transform('nunique').gt(1)].copy()
yd
Out[506]:
Data1 Data2 Data3
2 B XX BB
3 B YY CC
6 D XX EE
7 D YY FF
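To see why this works in one step, here is a small sketch that splits the expression into its parts. The DataFrame construction and the intermediate names `counts` and `mask` are assumptions for illustration only.

```python
import pandas as pd

# Rebuild the sample data from the question (construction is assumed).
df = pd.DataFrame({
    "Data1": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Data2": ["XX", "YY"] * 4,
    "Data3": ["AA", "AA", "BB", "CC", "DD", "DD", "EE", "FF"],
})

# transform broadcasts each group's distinct count back onto every row,
# so the result has the same index as df.
counts = df.groupby("Data1")["Data3"].transform("nunique")

# Boolean mask: True where the row's Data1 group has more than one Data3 value.
mask = counts.gt(1)

# .copy() gives an independent frame, so later edits to yd do not
# trigger chained-assignment warnings.
yd = df[mask].copy()
print(yd)
```

Because `counts` is aligned row by row with `df`, it can be used directly as a boolean filter without a separate `isin` lookup.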
Upvotes: 2