Reputation: 4353

Add column with number of ratings per user, pandas

I am working with a book rating dataset of the form

userID | ISBN | Rating
23413    1232     2.5
12321    2311     3.2
23413    2532     1.7
23413    7853     3.8

Now I need to add a fourth column that contains the number of ratings each user has in the entire dataset:

userID | ISBN | Rating | Ratings_per_user
23413    1232     2.5         3
12321    2311     3.2         1
23413    2532     1.7         3 
23413    7853     3.8         3

I have tried:

df_new['Ratings_per_user'] = df_new['userID'].value_counts()

but I get a warning:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

and the entire new column is filled with NaN.

Upvotes: 3

Answers (3)

heena bawa

Reputation: 828

You can use map to look up each row's userID in the per-user counts:

df['Rating per user'] = df['userID'].map(df.groupby('userID')['Rating'].count())
print(df)

   userID  ISBN  Rating  Rating per user
0   23413  1232     2.5                3
1   12321  2311     3.2                1
2   23413  2532     1.7                3
3   23413  7853     3.8                3
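A minimal sketch of why the attempt in the question yields NaN while map works, using the four sample rows from the question. value_counts() returns a Series indexed by userID, and assigning a Series to a column aligns on index labels, so row labels 0..3 find no match. The column name `direct` is only for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "userID": [23413, 12321, 23413, 23413],
    "ISBN": [1232, 2311, 2532, 7853],
    "Rating": [2.5, 3.2, 1.7, 3.8],
})

# Indexed by userID (23413, 12321), not by row label
counts = df["userID"].value_counts()

# Assignment aligns on index labels: 0..3 are not userIDs, so every row is NaN
df["direct"] = counts

# map looks up each row's userID value in counts instead
df["Ratings_per_user"] = df["userID"].map(counts)

print(df)
```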

Upvotes: 0

anky

Reputation: 75080

Use groupby with transform('count'), which returns one value per original row:

df_new['Ratings_per_user']=df_new.groupby('userID')['userID'].transform('count')

   userID  ISBN  rating  Ratings_per_user
0   23413  1232     2.5                 3
1   12321  2311     3.2                 1
2   23413  2532     1.7                 3
3   23413  7853     3.8                 3
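The "copy of a slice" message in the question usually means df_new was itself derived from another DataFrame. The sketch below assumes that is the case: the parent frame `ratings` and the filter are hypothetical, since the question does not show how df_new was built. Taking an explicit copy before adding the column is one common way to avoid the warning.

```python
import pandas as pd

# Hypothetical parent frame; the question does not show where df_new comes from
ratings = pd.DataFrame({
    "userID": [23413, 12321, 23413, 23413],
    "ISBN": [1232, 2311, 2532, 7853],
    "Rating": [2.5, 3.2, 1.7, 3.8],
})

# Assumed origin of df_new: a filtered subset. .copy() makes it an independent frame
df_new = ratings[ratings["Rating"] > 0].copy()

# transform returns a Series with the same index as df_new, so it assigns cleanly
df_new["Ratings_per_user"] = df_new.groupby("userID")["userID"].transform("count")

print(df_new)
```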

Upvotes: 1

Sociopath

Reputation: 13401

Convert the result of value_counts into a dict, then use replace to create the new column with the number of ratings per user:

x = df['userID'].value_counts().to_dict()

df['rating_per_user'] = df['userID'].replace(x)
print(df)

Output:

   userID  ISBN  rating  rating_per_user
0   23413  1232     2.5                3
1   12321  2311     3.2                1
2   23413  2532     1.7                3
3   23413  7853     3.8                3
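One difference between replace and map worth knowing: when a value has no entry in the dict, replace leaves it unchanged, whereas map returns NaN. With a dict built from the same column, as above, every userID has an entry, so both give the same result. The sketch below uses a hypothetical, deliberately incomplete lookup to show the difference.

```python
import pandas as pd

user_ids = pd.Series([23413, 12321, 23413, 99999])

# Hypothetical lookup that is deliberately missing userID 99999
lookup = {23413: 2, 12321: 1}

mapped = user_ids.map(lookup)        # unmatched userID becomes NaN
replaced = user_ids.replace(lookup)  # unmatched userID stays 99999

print(mapped)
print(replaced)
```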

Upvotes: 1
