Talha

Reputation: 85

Count total occurrences of each Id in another column

I have a DataFrame:

import pandas as pd

# Initialise data as a dict of lists.
data = {'Id': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'], 'reply_id': [2, 2, 2, 5, 5, 6, 8, 8, 1, 1]}

# Create the DataFrame.
df = pd.DataFrame(data)

   Id  reply_id
0   1         2
1   2         2
2   3         2
3   4         5
4   5         5
5   6         6
6   7         8
7   8         8
8   9         1
9  10         1

For every Id, I want the number of times it occurs in reply_id, stored in a new column new.

For example, Id=1 occurs 2 times in reply_id, so its value in new should be 2.

Desired output

   Id  reply_id  new
0   1         2    2
1   2         2    3
2   3         2    0
3   4         5    0
4   5         5    2
5   6         6    1
6   7         8    0
7   8         8    2
8   9         1    0
9  10         1    0

This is what I have tried so far:

df['new'] = df.reply_id.eq(df.Id).astype(int).groupby(df.Id).transform('sum')

Upvotes: 3

Views: 129

Answers (2)

Hugolmn

Reputation: 1560

In this answer, I use Series.value_counts to count the values in reply_id and convert the result to a dict. Then I use Series.map on the Id column to look up the count for each Id. Id is cast to int first because it holds strings while reply_id holds integers. fillna(0) fills in 0 for every Id that does not appear in reply_id.

df['new'] = (df['Id']
             .astype(int)
             .map(df['reply_id'].value_counts().to_dict())
             .fillna(0)
             .astype(int))
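To make the steps above easier to follow, here is a minimal, self-contained sketch that splits the chain into its parts, using the sample data from the question. The intermediate name counts is only introduced here for illustration and is not part of the one-liner above.

```python
import pandas as pd

# Same sample data as in the question; note that Id holds strings.
df = pd.DataFrame({
    'Id': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'],
    'reply_id': [2, 2, 2, 5, 5, 6, 8, 8, 1, 1],
})

# How often each value occurs in reply_id, as a plain dict
# (illustrative intermediate variable).
counts = df['reply_id'].value_counts().to_dict()

# Cast Id to int so its values match the integer keys of counts.
# Ids that never occur in reply_id map to NaN, hence fillna(0).
df['new'] = df['Id'].astype(int).map(counts).fillna(0).astype(int)

print(df)
```

Splitting it this way lets you inspect counts on its own before mapping, which can help when the result is unexpectedly all zeros.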

Upvotes: 2

Shubham Sharma

Reputation: 71689

Use Series.groupby on the column reply_id, then use the aggregation function GroupBy.count to create a mapping series counts. Finally, use Series.map to map the values in the Id column to their respective counts:

counts = df['reply_id'].groupby(df['reply_id']).count()
# Id holds strings in the question's data, so cast it to int before mapping.
df['new'] = df['Id'].astype(int).map(counts).fillna(0).astype(int)

Result:

# print(df)

   Id  reply_id  new
0   1         2    2
1   2         2    3
2   3         2    0
3   4         5    0
4   5         5    2
5   6         6    1
6   7         8    0
7   8         8    2
8   9         1    0
9  10         1    0
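One thing to watch with this kind of mapping: in the question's sample data, Id holds strings while reply_id holds integers. The sketch below compares mapping Id as-is with mapping it after a cast to int. The names as_strings and as_ints are hypothetical and only used for this comparison.

```python
import pandas as pd

# Sample data from the question: Id is a column of strings.
df = pd.DataFrame({
    'Id': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'],
    'reply_id': [2, 2, 2, 5, 5, 6, 8, 8, 1, 1],
})

# Series of counts indexed by the integer reply_id values.
counts = df['reply_id'].groupby(df['reply_id']).count()

# String Ids never match the integer index of counts, so every lookup misses.
as_strings = df['Id'].map(counts).fillna(0).astype(int)

# After casting Id to int, the lookups line up with the index of counts.
as_ints = df['Id'].astype(int).map(counts).fillna(0).astype(int)

print(as_strings.tolist())
print(as_ints.tolist())
```

If the new column comes out as all zeros, checking df.dtypes for a mismatch like this is a reasonable first step.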

Upvotes: 1
