Vignesh
Vignesh

Reputation: 25

How to count duplicate rows and compare the two column values in excel using python

This is rows and columns

email                mark
[email protected]           50
[email protected]           60
[email protected]           50
[email protected]           60
[email protected]           60

This is excepted output

email                   mark    totalcount
[email protected]             50      1
[email protected]             50      1
[email protected]             60      3

This is my python code

df=pd.read_excel('email.xlsx')
df['Total'] = df.mark.apply(lambda x: df.mark.value_counts()[x])
dr = data_file[['email', 'mark', 'totalcount']]
print(dr)

my output came like this

          email        mark    totalcount
0          [email protected]   50     2
1          [email protected]   60     3
2          [email protected]   50     2
3          [email protected]   60     3
4          [email protected]   60     3

how to compare the two-column and add the duplicate row values. so could you please help me out

Upvotes: 1

Views: 443

Answers (1)

Buckeye14Guy
Buckeye14Guy

Reputation: 851

You should take both email and mark into account. I think grouping and transforming would work

df['total_count'] = df.groupby(['email', 'mark'])['mark'].transform('count')
dr = df.drop_duplicates()

Output:

      email      mark  total_count
0  [email protected]    50            1
1  [email protected]    60            3
2  [email protected]    50            1

Upvotes: 3

Related Questions