Cannon

Reputation: 319

Df.drop/delete duplicate rows

How can I drop exact duplicates of a row? Say I have a data frame that looks like this:

A    B    C
1    2    3
3    2    2
1    2    3

My data frame is a lot larger than this, but is there a way to have Python look at every row and, if the values in a row are exactly the same as those in another row, drop or delete that row? I want this to take the whole data frame into account; I don't want to specify the columns to get unique values for.

Upvotes: 0

Views: 646

Answers (2)

MaxU - stand with Ukraine

Reputation: 210982

You can use the DataFrame.drop_duplicates() method:

In [23]: df
Out[23]:
   A  B  C
0  1  2  3
1  3  2  2
2  1  2  3

In [24]: df.drop_duplicates()
Out[24]:
   A  B  C
0  1  2  3
1  3  2  2
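
As a rough sketch of how this might look in a script rather than an interactive session: drop_duplicates() returns a new DataFrame and leaves df untouched, so the result needs to be assigned. The frame below is rebuilt from the question's sample data, and the names deduped and no_repeats are only illustrative.

```python
import pandas as pd

# Sample data from the question: rows 0 and 2 are identical
df = pd.DataFrame({"A": [1, 3, 1], "B": [2, 2, 2], "C": [3, 2, 3]})

# With no arguments, every column is compared and the first
# occurrence of each duplicated row is kept (keep="first")
deduped = df.drop_duplicates().reset_index(drop=True)

# keep=False drops every row that has a duplicate, including the first copy
no_repeats = df.drop_duplicates(keep=False)

print(deduped)
print(no_repeats)
```

The reset_index(drop=True) call is optional; it only renumbers the remaining rows from 0.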

Upvotes: 3

mechanical_meat

Reputation: 169524

You can also get a de-duplicated DataFrame by negating the boolean mask returned by .duplicated:

df[~df.duplicated(['A','B','C'])]

Returns:

>>> df[~df.duplicated(['A','B','C'])]
   A  B  C
0  1  2  3
1  3  2  2
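
Since the question asks not to name columns, it may be worth noting that duplicated() compares all columns when called without a subset. A minimal sketch, assuming the same sample frame as in the question (mask and unique_rows are illustrative names):

```python
import pandas as pd

# Sample data from the question: rows 0 and 2 are identical
df = pd.DataFrame({"A": [1, 3, 1], "B": [2, 2, 2], "C": [3, 2, 3]})

# No subset given, so whole rows are compared; later repeats are flagged True
mask = df.duplicated()

# Keep only the rows that are not flagged as repeats
unique_rows = df[~mask]

print(mask)
print(unique_rows)
```

The mask form can be handy when the duplicate rows themselves need inspecting first, for example with df[mask].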

Upvotes: 2
