luke14free
luke14free

Reputation: 2539

Pandas create column containing a row count

How can this query be reproduced in pandas?

SELECT (SELECT COUNT(1) FROM table t1 WHERE t1.full_name = t2.full_name) FROM table t2;

I want to count the number of rows that have the same value and save the value to the data frame.

What I tried until now is

df.full_name.values_count() 

Which returns me a list of unique values, I then tried to merge it to the original database (with df.merge) but had no luck.

Upvotes: 0

Views: 2280

Answers (1)

chrisb
chrisb

Reputation: 52226

First, wrap the value counts (which is a Series) in a DataFrame.

counts = pd.DataFrame(df['full_name'].value_counts(), columns=['num_unique'])

Then, merge, left on the original column, and right on the index.

df = df.merge(counts, left_on=['full_name'], right_index=True)

Upvotes: 1

Related Questions