ComplexData

Reputation: 1093

Display rows where any value in a particular column occurs more than once

I want to display all rows where a value in the "website" column occurs more than once. For example, if the website "xyz.com" occurs more than once, I want to display all of its rows. I am using the code below:

df[df.website.isin(df.groupby('website').website.count() > 1)]

The code above returns zero rows, but running the code below shows that many websites occur more than once:

df.website.value_counts()

How should I modify my first line of code to display all such rows?

Upvotes: 4

Views: 3264

Answers (1)

root

Reputation: 33793

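Use DataFrame.duplicated with subset='website' and keep=False, which marks every row whose website value appears more than once, including the first occurrence: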

df[df.duplicated(subset='website', keep=False)]
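As for why the original attempt returns nothing: `df.groupby('website').website.count() > 1` is a boolean Series indexed by website name, and `isin` checks membership against its values (True/False), not its index, so no website string matches. Below is a minimal sketch of two ways to keep the groupby idea working. The small frame is made up for illustration and is not the asker's data.

```python
import pandas as pd

# Illustrative data (an assumption, not the asker's real frame).
df = pd.DataFrame({"website": ["abc.com", "abc.com", "xyz.com", "foo.bar"]})

# The comparison yields a boolean Series indexed by website name.
mask_by_site = df.groupby("website").website.count() > 1

# isin() tests membership against the Series *values* (True/False),
# so no website string ever matches and the result is empty.
original_attempt = df[df.website.isin(mask_by_site)]

# Option 1: pass the website names whose count exceeds 1.
repeated_sites = mask_by_site[mask_by_site].index
via_isin = df[df.website.isin(repeated_sites)]

# Option 2: broadcast each group's size back onto its rows.
via_transform = df[df.groupby("website").website.transform("size") > 1]
```

Both options select the same rows as `duplicated(..., keep=False)` on this data; `duplicated` is simply the shorter spelling.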

Sample Input:

  col1  website
0    A  abc.com
1    B  abc.com
2    C  abc.com
3    D  abc.net
4    E  xyz.com
5    F  foo.bar
6    G  xyz.com
7    H  foo.baz 

Sample Output:

  col1  website
0    A  abc.com
1    B  abc.com
2    C  abc.com
4    E  xyz.com
6    G  xyz.com
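
To try this end to end, here is a small self-contained sketch that rebuilds the sample input shown above and applies the same call. The variable names `result` and `repeats_only` are only illustrative.

```python
import pandas as pd

# Rebuild the sample input shown above.
df = pd.DataFrame({
    "col1": list("ABCDEFGH"),
    "website": ["abc.com", "abc.com", "abc.com", "abc.net",
                "xyz.com", "foo.bar", "xyz.com", "foo.baz"],
})

# keep=False marks every member of a duplicated group, not just the repeats.
result = df[df.duplicated(subset="website", keep=False)]
print(result)

# For comparison: the default keep="first" leaves the first occurrence unmarked,
# so only the later repeats are selected.
repeats_only = df[df.duplicated(subset="website")]
```

With the default `keep="first"`, rows 0 and 4 would be missing from the output, which is why `keep=False` is needed here.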

Upvotes: 6
