Reputation: 2539
How can this query be reproduced in pandas?
SELECT (SELECT COUNT(1) FROM table t1 WHERE t1.full_name = t2.full_name) FROM table t2;
I want to count the number of rows that have the same value and save the value to the data frame.
What I tried until now is
df.full_name.values_count()
Which returns me a list of unique values, I then tried to merge it to the original database (with df.merge) but had no luck.
Upvotes: 0
Views: 2280
Reputation: 52226
First, wrap the value counts (which is a Series) in a DataFrame.
counts = pd.DataFrame(df['full_name'].value_counts(), columns=['num_unique'])
Then, merge, left on the original column, and right on the index.
df = df.merge(counts, left_on=['full_name'], right_index=True)
Upvotes: 1