Hadrien Berthier

Reputation: 305

Pandas: drop duplicates based on 2 columns when the second column has different values

How can I drop duplicates in this specific way?

Index B C
1     2 1
2     2 0
3     3 1
4     3 1
5     4 0
6     4 0 
7     4 0
8     5 1
9     5 0
10    5 1

Desired output :

Index B C
3     3 1
5     4 0

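So: drop duplicates on B, keeping one sample/record per value of B, but only when C is the same on every row with that B. If C differs within a B group, the whole group is dropped.

For example, B = 3 for index 3/4, and since C = 1 for both, I do not drop them all; one row is kept.

But B = 5 for index 8/9/10, and since C is 1 or 0 there, all of those rows get dropped.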

Upvotes: 1

Views: 43

Answers (1)

Scott Boston

Reputation: 153460

Try this, using transform with nunique and drop_duplicates:

df[df.groupby('B')['C'].transform('nunique') == 1].drop_duplicates(subset='B')

Output:

       B  C
Index      
3      3  1
5      4  0
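
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (assuming "Index" is the name of the index; the variable names n_unique and result are only illustrative) and splits the one-liner into two steps:

```python
import pandas as pd

# Sample data from the question; "Index" is assumed to be the index name.
df = pd.DataFrame(
    {
        "B": [2, 2, 3, 3, 4, 4, 4, 5, 5, 5],
        "C": [1, 0, 1, 1, 0, 0, 0, 1, 0, 1],
    },
    index=pd.Index(range(1, 11), name="Index"),
)

# Count the distinct C values within each B group and broadcast
# that count back onto every row of the group.
n_unique = df.groupby("B")["C"].transform("nunique")

# Keep only rows whose B group has a single C value,
# then keep the first row for each remaining B.
result = df[n_unique == 1].drop_duplicates(subset="B")
print(result)
```

drop_duplicates keeps the first occurrence by default, which is why index 3 and index 5 are the rows that survive.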

Upvotes: 1
