Count occurrences of duplicate values in a DataFrame column and store the running count in a new column (Python)

Question

Example: Let's say I have a df

Id 
A
B
C
A
A
B

It should look like:

Id  count
A   1
B   1
C   1
A   2
A   3
B   2

Note: I've tried nested for loops and a while loop. They work for small datasets but take a very long time on large ones.

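counts = []
for i in range(len(df)):        # nested loops: O(n^2), slow on large data
    count = 0
    for j in range(i + 1):      # compare row i with every row up to and including i
        if df['Id'].iloc[i] == df['Id'].iloc[j]:
            count += 1
    counts.append(count)
df['count'] = counts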

Terry · Accepted Answer

You can use groupby with cumcount. cumcount numbers the rows within each group starting from 0, so add 1 to get a count that starts at 1:

df['counts'] = df.groupby('Id', sort=False).cumcount() + 1
df

    Id  counts
0   A   1
1   B   1
2   C   1
3   A   2
4   A   3
5   B   2
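
If you want to run this end to end, here is a minimal sketch. The DataFrame construction is an assumption about how df was built in the question; the groupby line is the same one as above.

```python
import pandas as pd

# Sample data from the question (how df is constructed is assumed here)
df = pd.DataFrame({"Id": ["A", "B", "C", "A", "A", "B"]})

# cumcount() numbers the rows inside each Id group as 0, 1, 2, ...
# in their original order, so adding 1 gives the running occurrence count.
df["counts"] = df.groupby("Id", sort=False).cumcount() + 1

print(df)
```

This is vectorized, so it avoids the row-by-row comparisons of the nested loop and should scale much better on large frames. The row order of df is left unchanged.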
