Reputation: 25
This is rows and columns
email mark
[email protected] 50
[email protected] 60
[email protected] 50
[email protected] 60
[email protected] 60
This is excepted output
email mark totalcount
[email protected] 50 1
[email protected] 50 1
[email protected] 60 3
This is my python code
df=pd.read_excel('email.xlsx')
df['Total'] = df.mark.apply(lambda x: df.mark.value_counts()[x])
dr = data_file[['email', 'mark', 'totalcount']]
print(dr)
my output came like this
email mark totalcount
0 [email protected] 50 2
1 [email protected] 60 3
2 [email protected] 50 2
3 [email protected] 60 3
4 [email protected] 60 3
how to compare the two-column and add the duplicate row values. so could you please help me out
Upvotes: 1
Views: 443
Reputation: 851
You should take both email and mark into account. I think grouping and transforming would work
df['total_count'] = df.groupby(['email', 'mark'])['mark'].transform('count')
dr = df.drop_duplicates()
Output:
email mark total_count
0 [email protected] 50 1
1 [email protected] 60 3
2 [email protected] 50 1
Upvotes: 3