How to count duplicate rows and compare two column values in Excel using Python

Question

These are the rows and columns:

email                mark
A@email.com           50
B@email.com           60
B@email.com           50
B@email.com           60
B@email.com           60

This is the expected output:

email                   mark    totalcount
A@email.com             50      1
B@email.com             50      1
B@email.com             60      3

This is my Python code:

import pandas as pd

df = pd.read_excel('email.xlsx')
# Counts how often each mark occurs, ignoring the email column
df['totalcount'] = df.mark.apply(lambda x: df.mark.value_counts()[x])
dr = df[['email', 'mark', 'totalcount']]
print(dr)

My output came out like this:

          email        mark    totalcount
0          A@email.com   50     2
1          B@email.com   60     3
2          B@email.com   50     2
3          B@email.com   60     3
4          B@email.com   60     3

How can I compare the two columns and count the duplicate rows? Could you please help me out?

Buckeye14Guy · Accepted Answer

You should take both email and mark into account. I think grouping on both columns and transforming would work:

df['total_count'] = df.groupby(['email', 'mark'])['mark'].transform('count')
dr = df.drop_duplicates()

Output:

      email      mark  total_count
0  A@email.com    50            1
1  B@email.com    60            3
2  B@email.com    50            1
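
For anyone who wants to try this without the Excel file, here is a minimal sketch that builds the question's sample data in memory (an assumption standing in for email.xlsx) and runs the grouping above. It also shows an alternative that is not part of the answer: calling size() on the same groups, which returns one row per (email, mark) pair sorted by the group keys, matching the row order in the expected output. The names summary and totalcount are chosen here for illustration.

```python
import pandas as pd

# Sample rows copied from the question, built in memory instead of read from email.xlsx
df = pd.DataFrame({
    "email": ["A@email.com", "B@email.com", "B@email.com", "B@email.com", "B@email.com"],
    "mark": [50, 60, 50, 60, 60],
})

# Approach from the answer: count rows per (email, mark) pair, then drop the repeated rows
df["total_count"] = df.groupby(["email", "mark"])["mark"].transform("count")
dr = df.drop_duplicates()
print(dr)

# Alternative sketch (not from the answer): size() yields one row per pair,
# sorted by email and then mark
summary = df.groupby(["email", "mark"]).size().reset_index(name="totalcount")
print(summary)
```

The transform version keeps the original row order and index, while the size() version returns a fresh, sorted frame. Which one fits better depends on whether the original ordering matters.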
