Arief Hidayat

Reputation: 967

Drop rows based on a condition across two columns

I have a DataFrame that looks like this:

df
Data1   Data2   Data3
A       XX      AA
A       YY      AA
B       XX      BB
B       YY      CC
C       XX      DD
C       YY      DD
D       XX      EE
D       YY      FF

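I want to delete every row whose Data1 group has the same Data3 value across all of its Data2 rows. For example, both rows for A have Data3 = AA, so both should be dropped, while the rows for B have different Data3 values (BB and CC), so both should be kept.

My expected result looks like this: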

Data1   Data2   Data3
B       XX      BB
B       YY      CC
D       XX      EE
D       YY      FF

How can I do this?

Upvotes: 0

Views: 72

Answers (2)

U13-Forward

Reputation: 71560

You can also use a groupby with nunique, and a selection of rows:

>>> group = df.groupby('Data1')['Data3'].nunique()
>>> df[df['Data1'].isin(group[group.gt(1)].index)]
  Data1 Data2 Data3
2     B    XX    BB
3     B    YY    CC
6     D    XX    EE
7     D    YY    FF
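For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the table in the question; the names `counts`, `keep`, and `result` are only illustrative.

```python
import pandas as pd

# Rebuild the example DataFrame from the question
df = pd.DataFrame({
    "Data1": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Data2": ["XX", "YY"] * 4,
    "Data3": ["AA", "AA", "BB", "CC", "DD", "DD", "EE", "FF"],
})

# Number of distinct Data3 values within each Data1 group
counts = df.groupby("Data1")["Data3"].nunique()

# Data1 keys whose group has more than one distinct Data3 value
keep = counts[counts.gt(1)].index

# Keep only the rows that belong to those groups
result = df[df["Data1"].isin(keep)]
print(result)
```

Here `counts` has one entry per Data1 key, which is why the extra `isin` step is needed to map the surviving keys back onto the original rows.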

Upvotes: 1

BENY

Reputation: 323226

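Use groupby + transform with nunique. Because transform returns a result aligned with the original rows, it can be used directly as a boolean mask:

yd = df[df.groupby('Data1')['Data3'].transform('nunique').gt(1)].copy()
yd
Out[506]: 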
  Data1 Data2 Data3
2     B    XX    BB
3     B    YY    CC
6     D    XX    EE
7     D    YY    FF
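To see why this works, it may help to look at the intermediate values. The sketch below rebuilds the question's DataFrame and splits the one-liner into steps; the names `per_row` and `mask` are only illustrative.

```python
import pandas as pd

# Rebuild the example DataFrame from the question
df = pd.DataFrame({
    "Data1": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Data2": ["XX", "YY"] * 4,
    "Data3": ["AA", "AA", "BB", "CC", "DD", "DD", "EE", "FF"],
})

# transform broadcasts each group's nunique back to every row of that group,
# so per_row has the same index and length as df
per_row = df.groupby("Data1")["Data3"].transform("nunique")

# True for rows whose Data1 group has more than one distinct Data3 value
mask = per_row.gt(1)

# .copy() gives an independent frame, so later assignments to yd
# are not made on a slice of df
yd = df[mask].copy()
print(yd)
```

With this data `per_row` is 1, 1, 2, 2, 1, 1, 2, 2, so the mask selects rows 2, 3, 6, and 7.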

Upvotes: 2
