Ugly Executioner

Reputation: 33

I want to count the occurrences of duplicate values in a DataFrame column and store the running count in a new column, in Python.

Example: say I have a df like this:

Id 
A
B
C
A
A
B

It should look like:

Id  count
A   1
B   1
C   1
A   2
A   3
B   2

Note: I've tried using for loops and while loops. They work for small datasets but take a lot of time on large ones.

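counts = []
for i in range(len(df)):
    count = 0
    for j in range(i + 1):  # compare row i with each row up to itself
        if df['Id'].iloc[i] == df['Id'].iloc[j]:
            count += 1
    counts.append(count)
df['count'] = counts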

Upvotes: 2

Views: 733

Answers (2)

Terry

Reputation: 2811

You can groupby with cumcount, like this:

df['counts'] = df.groupby('Id', sort=False).cumcount() + 1
print(df)

    Id  counts
0   A   1
1   B   1
2   C   1
3   A   2
4   A   3
5   B   2
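
For anyone who wants to check this end to end, here is a minimal self-contained sketch. It assumes the six sample rows from the question and keeps the `df` and `counts` names used above; nothing else is implied about the real dataset.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Id": ["A", "B", "C", "A", "A", "B"]})

# cumcount() numbers the rows within each group starting at 0,
# so adding 1 gives a running occurrence count per value.
df["counts"] = df.groupby("Id", sort=False).cumcount() + 1

print(df)
```

Because the grouping is done by pandas rather than by a nested Python loop, this should scale much better than comparing every row with every other row.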

Upvotes: 3

Sunil Jaitade

Reputation: 56

# Total occurrences of each value in 'Id' (one row per distinct value)
dups_values = df.pivot_table(index=['Id'], aggfunc='size')
print(dups_values)
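
Note that a size aggregation gives the total count per distinct value, which is not quite the running count the question asks for. A rough sketch of the difference, assuming the column is named `Id` as in the question and using `groupby(...).size()` as a stand-in for the total count (the names `totals` and `running` are only illustrative):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Id": ["A", "B", "C", "A", "A", "B"]})

# One entry per distinct value: how many times it occurs in total.
totals = df.groupby("Id").size()
print(totals)

# One entry per original row: which occurrence of its value this row is.
running = df.groupby("Id", sort=False).cumcount() + 1
print(running.tolist())
```

The first result has three entries (one per distinct value), while the second has six (one per row), so only the second can be assigned directly as a new column.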

Upvotes: 0
