Natalia

Reputation: 71

How to count duplicates in Pandas?

I use this pattern to filter out the rows whose value in column A appears in the set duplicates:

duplicates = {1, 2, 3}
df[~df['A'].isin(duplicates)]

It works and returns the rows without duplicates. But how do I get the count of duplicates?

I have tried these:

df[~df['A'].isin(duplicates)].count()
~df['A'].isin(duplicates).count()

And how do I store this count in a variable?

Upvotes: 0

Views: 119

Answers (1)

Olasimbo

Reputation: 1063

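# Keep rows whose 'A' is not in the set; .copy() avoids SettingWithCopyWarning on the assignment below
new_df = df[~df['A'].isin(duplicates)].copy()
# Flag every repeated 'A' value (each occurrence after the first)
new_df['duplicate_values'] = new_df.duplicated('A')
# Sum the boolean flags and store the count in a variable
count = int(new_df['duplicate_values'].sum())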
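
If the goal is instead to count the rows whose 'A' value is in the set, a minimal sketch follows. The sample frame is hypothetical, since the question does not show the real data. It may also explain why the attempts in the question did not help: .count() returns the number of non-null entries rather than the number of True values, and applying ~ to that integer gives -(n + 1).

```python
import pandas as pd

# Hypothetical sample data; the question does not show the real frame.
df = pd.DataFrame({"A": [1, 2, 2, 3, 4, 5, 5]})
duplicates = {1, 2, 3}

# isin() returns a boolean Series; summing it counts the True entries.
mask = df["A"].isin(duplicates)
in_set_count = int(mask.sum())          # rows whose 'A' is in the set
not_in_set_count = int((~mask).sum())   # rows that survive the filter

# Repeated 'A' values among the surviving rows (occurrences after the first).
repeat_count = int(df.loc[~mask, "A"].duplicated().sum())
```

Each result is a plain int held in an ordinary variable, so it can be printed or reused directly.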

Upvotes: 2
