Count total rows of an Id from another column

Question

I have a DataFrame:

import pandas as pd

# Initialise the data as a dict of lists.
data = {'Id': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'], 'reply_id': [2, 2, 2, 5, 5, 6, 8, 8, 1, 1]}

# Create the DataFrame.
df = pd.DataFrame(data)

   Id   reply_id
0   1   2
1   2   2
2   3   2
3   4   5
4   5   5
5   6   6
6   7   8
7   8   8
8   9   1
9   10  1

For every Id, I want the number of times it occurs in reply_id, stored in a new column named new.

For example, Id=1 occurs 2 times in reply_id, so new should be 2 for that row.

Desired output

   Id   reply_id   new
0   1   2          2
1   2   2          3
2   3   2          0
3   4   5          0
4   5   5          2
5   6   6          1
6   7   8          0
7   8   8          2
8   9   1          0
9   10  1          0

This is the line of code I have tried so far:

df['new'] = df.reply_id.eq(df.Id).astype(int).groupby(df.Id).transform('sum')

Hugolmn · Accepted Answer

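In this answer, I used Series.value_counts to count how often each value occurs in reply_id, and converted the result to a dict. Then I used Series.map on the Id column (cast to int first, because Id holds strings while reply_id holds integers) to look up the count for each Id. fillna(0) fills in 0 for any Id that never appears in reply_id, and the final astype(int) turns the counts back into integers.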

df['new'] = (df['Id']
             .astype(int)
             .map(df['reply_id'].value_counts().to_dict())
             .fillna(0)
             .astype(int))
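To check the snippet above against the data from the question, here is a self-contained sketch. It uses the same chain of calls; the intermediate variable name counts is only introduced here for readability and is not part of the original answer.

```python
import pandas as pd

# Sample data from the question: Id holds strings, reply_id holds integers.
data = {
    'Id': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'],
    'reply_id': [2, 2, 2, 5, 5, 6, 8, 8, 1, 1],
}
df = pd.DataFrame(data)

# Count how often each value occurs in reply_id, as a plain dict {value: count}.
counts = df['reply_id'].value_counts().to_dict()

# Look up each Id's count. Ids that never occur in reply_id map to NaN,
# so fillna(0) replaces those with 0 before casting back to int.
df['new'] = (df['Id']
             .astype(int)
             .map(counts)
             .fillna(0)
             .astype(int))

print(df['new'].tolist())  # [2, 3, 0, 0, 2, 1, 0, 2, 0, 0]
```

The printed list matches the new column in the desired output. The astype(int) on Id matters here: without it, the string keys in Id would not match the integer keys in counts, and every row would end up as 0.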
