rosefun

Reputation: 1857

How to count the occurrences of each element in a column and add the result as a new column?

The DataFrame named df is shown as follows.

import pandas as pd 
df = pd.DataFrame({'id': [1, 1, 3]})

Input:

   id
0   1
1   1
2   3

I want to count how many times each id occurs and store the result in a new column named count.

Expected:

   id  count
0   1      2
1   1      2
2   3      1

Upvotes: 4

Views: 84

Answers (3)

piRSquared

Reputation: 294198

pd.factorize and np.bincount

My favorite. factorize does not sort and has a time complexity of O(n). For big data sets, factorize should be preferred over np.unique.

import numpy as np

i, u = df.id.factorize()
df.assign(Count=np.bincount(i)[i])

   id  Count
0   1      2
1   1      2
2   3      1
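
To see why indexing the bincount result by the factorize codes works, it may help to inspect the intermediate arrays. This is only a minimal sketch on the question's df; the names codes, uniques, and per_code are illustrative and not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 3]})

# factorize labels each distinct value with an integer code,
# numbered in order of first appearance (no sorting involved)
codes, uniques = pd.factorize(df['id'])
# codes is [0, 0, 1]; uniques holds the distinct ids 1 and 3

# bincount counts how many times each code occurs
per_code = np.bincount(codes)
# per_code is [2, 1]

# indexing the per-code counts by the codes maps each count back to its row
result = df.assign(Count=per_code[codes])
print(result)
```

The same indexing step applies to the np.unique variant, since return_inverse=True also yields one integer code per row.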

np.unique and np.bincount

u, i = np.unique(df.id, return_inverse=True)
df.assign(Count=np.bincount(i)[i])

   id  Count
0   1      2
1   1      2
2   3      1

Upvotes: 4

Alexander

Reputation: 109510

Assign the new count column to the dataframe by grouping on id and then transforming that column with value_counts (or size).

>>> df.assign(count=df.groupby('id')['id'].transform('value_counts'))
   id  count
0   1      2
1   1      2
2   3      1

Upvotes: 3

jezrael

Reputation: 862406

Use Series.map with Series.value_counts:

df['count'] = df['id'].map(df['id'].value_counts())
#alternative
#from collections import Counter
#df['count'] = df['id'].map(Counter(df['id']))

Detail:

print (df['id'].value_counts())
1    2
3    1
Name: id, dtype: int64

Or use GroupBy.transform with GroupBy.size, which returns a Series of the same size as the original DataFrame:

df['count'] = df.groupby('id')['id'].transform('size')
print (df)
   id  count
0   1     2
1   1     2
2   3     1
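
As a quick sanity check, the two approaches can be run side by side on the question's data. This is a hedged sketch; via_map and via_transform are hypothetical names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 3]})

# look up each row's id in the value_counts result
via_map = df['id'].map(df['id'].value_counts())

# compute each group's size and return it aligned to the original rows
via_transform = df.groupby('id')['id'].transform('size')

print(via_map.tolist())
print(via_transform.tolist())
```

Both should yield one count per row, so either can be assigned directly to df['count'].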

Upvotes: 3
