TestGuest

Reputation: 603

Counting duplicated elements in pandas dataframe

I want to count how many times each value occurs in a column of a pandas dataframe "data" (here the roi column) and write that number into the count column of every row that has that value.

For instance, roi 35 appears twice, so every row where roi is 35 should have a 2 in the count column.

So far I have tried the following:

data['count'] = data.groupby('roi').roi.count()

But this fails. What can I do?

Upvotes: 1

Views: 148

Answers (3)

ansev

Reputation: 30940

Use GroupBy.transform:

data['count'] = data.groupby('roi').roi.transform('size') 

or Series.map + Series.value_counts:

data['count'] = data.roi.map(data.roi.value_counts())
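As a quick check of both one-liners, here is a minimal sketch on made-up data (the roi values are hypothetical, since the asker's frame is only available as an image; count_map is just an illustrative column name):

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's frame.
data = pd.DataFrame({"roi": [35, 35, 40, 52, 52, 52]})

# transform('size') returns one value per original row, aligned to data's index.
data["count"] = data.groupby("roi")["roi"].transform("size")

# value_counts() is indexed by roi value; map() looks each row's roi up in it.
data["count_map"] = data["roi"].map(data["roi"].value_counts())

print(data)
```

Both columns should hold 2, 2, 1, 3, 3, 3 for this sample.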

Upvotes: 1

srty

Reputation: 130

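# Count the rows per roi value, then left-join the counts back onto the original rows (needs import pandas as pd).
roi_count = data.groupby('roi')['roi'].count().reset_index(name='count')

final_df = pd.merge(data, roi_count, how='left', on='roi')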
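A small runnable sketch of the merge approach, using made-up data (the roi values and the extra value column are hypothetical, added only to show that other columns are carried along):

```python
import pandas as pd

# Hypothetical sample data; "value" stands in for any other columns in the frame.
data = pd.DataFrame({
    "roi": [35, 35, 40, 52, 52, 52],
    "value": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
})

# One row per distinct roi, with its number of occurrences in a "count" column.
roi_count = data.groupby("roi")["roi"].count().reset_index(name="count")

# A left join keeps every row of data, in its original order.
final_df = pd.merge(data, roi_count, how="left", on="roi")

print(final_df)
```

Two things to keep in mind: the merge returns a new frame with a fresh integer index rather than modifying data in place, and if data already has a column named count, pandas will suffix the two clashing columns as count_x and count_y.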

Upvotes: 1

Hussain Abdullah

Reputation: 43

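Try using this line:

data['count'] = data['roi'].map(data.groupby('roi').size())

groupby('roi').size() returns a Series indexed by roi value, so it cannot be assigned to the column directly: pandas would align it on the row index instead of on roi. map() looks up each row's roi in that Series, which puts the count on every matching row.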
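To see the index alignment concretely, here is a small sketch on made-up data (the values are hypothetical, not the asker's). It compares assigning the groupby(...).size() result directly with looking it up through map():

```python
import pandas as pd

# Hypothetical sample data with the default row labels 0..5.
data = pd.DataFrame({"roi": [35, 35, 40, 52, 52, 52]})

# Series of group sizes, indexed by roi value (35, 40, 52).
sizes = data.groupby("roi").size()

# Direct assignment aligns on index labels: rows 0..5 find no label 35, 40 or 52.
misaligned = data.copy()
misaligned["count"] = sizes

# map() uses each row's roi value as the lookup key instead.
data["count"] = data["roi"].map(sizes)

print(misaligned)
print(data)
```

With this sample, the directly assigned column comes out entirely NaN, while the mapped column holds the per-row counts.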

Upvotes: 2
