Krush23

Reputation: 741

Check whether a DataFrame contains duplicate rows (Python)

df = 

Name    Age City
Jack    34  Sydney
Riti    30  Delhi
Aadi    16  New York
Riti    30  Delhi
Riti    30  Delhi
Riti    30  Mumbai
Aadi    40  London
Sachin  30  Delhi

df[df.duplicated(keep='last')]

The code above returns the duplicated rows. What I need instead is a check: if the df contains at least one duplicate row, it should return "The df contains duplicate rows".

Upvotes: 1

Views: 141

Answers (2)

Seleme

Reputation: 251

duplicated returns a boolean Series with one value per row. With the default keep='first', a row is marked True if it repeats an earlier row, and the first occurrence is marked False.

Hence, you can do the below:

df.duplicated().any()

It will return True if there is any duplicate in your DataFrame.
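
To make this concrete, here is a minimal sketch that rebuilds the DataFrame from the question and prints the boolean mask before reducing it with any(). The variable names mask and has_duplicates are my own, and the column values are copied from the question as I read them.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Jack", "Riti", "Aadi", "Riti", "Riti", "Riti", "Aadi", "Sachin"],
    "Age": [34, 30, 16, 30, 30, 30, 40, 30],
    "City": ["Sydney", "Delhi", "New York", "Delhi", "Delhi", "Mumbai", "London", "Delhi"],
})

# Default keep='first': only the repeats of an earlier row are flagged.
mask = df.duplicated()
print(mask.tolist())

# any() collapses the mask to a single value; bool() gives a plain Python bool.
has_duplicates = bool(mask.any())
print(has_duplicates)

# keep=False flags every member of a duplicate group, including the first one.
print(int(df.duplicated(keep=False).sum()))
```

The result of any() is the same for every keep setting, since each setting flags at least one row whenever a duplicate group exists.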

Upvotes: 1

Sayandip Dutta

Reputation: 15872

You can use any:

>>> df
     Name  Age     City
0    Jack   34   Sydney
1    Riti   30    Delhi
2    Aadi   16  NewYork
3    Riti   30    Delhi
4    Riti   30    Delhi
5    Riti   30   Mumbai
6    Aadi   40   London
7  Sachin   30    Delhi
>>> df.duplicated().any()
True
>>> 'The df contains duplicates' if df.duplicated().any() else 'no duplicates' 
'The df contains duplicates'
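
If you need this check in more than one place, the conditional expression above could be wrapped in a small helper. This is only a sketch: duplicate_message is a hypothetical name, and the wording of the "no duplicates" message is my assumption, since the question only specifies the positive case.

```python
import pandas as pd


def duplicate_message(frame: pd.DataFrame) -> str:
    """Return a message saying whether `frame` has at least one duplicate row."""
    if frame.duplicated().any():
        return "The df contains duplicate rows"
    # Assumed wording; the question does not say what to return here.
    return "The df contains no duplicate rows"


# Small illustrative frame with one repeated row.
sample = pd.DataFrame({
    "Name": ["Riti", "Riti", "Aadi"],
    "Age": [30, 30, 16],
    "City": ["Delhi", "Delhi", "London"],
})

print(duplicate_message(sample))
print(duplicate_message(sample.drop_duplicates()))
```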

Upvotes: 1
