How to find unique values in column A for each unique user in column B?

Question

I have a data frame which looks like this:

df=
['UserId','SessionId','Item_class']
[1       ,34         ,'toy'       ]
[1       ,35         ,'book'      ]
[2       ,36         ,'book'      ]

Note that there is a 1:n relationship between UserId and SessionId, as one user can have multiple sessions in which they purchase an item.

I need to find out how many unique items each user purchased, with output like this:

 df=
    ['UserId','number_items']
    [1       ,2             ]
    [2       ,1             ]

I found many topics that only discuss how to get the unique values of a single column, e.g. df.Item_class.unique(), but I didn't find anything that breaks this down by another column, in this case UserId.

Hope someone can help. Thanks.

Georgina Skibinski · Accepted Answer

Try this one:

>>> df.groupby("UserId").Item_class.nunique()
UserId
1    2
2    1

It counts the unique Item_class values per UserId.
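
If you want the result as a data frame with the column names from the question rather than a Series, something along these lines should work. This is only a sketch: the sample frame is rebuilt by hand from the example in the question, and `result` is just an illustrative variable name.

```python
import pandas as pd

# Sample data rebuilt from the example in the question (an assumption,
# since the real frame is not shown in full).
df = pd.DataFrame({
    "UserId": [1, 1, 2],
    "SessionId": [34, 35, 36],
    "Item_class": ["toy", "book", "book"],
})

# Count distinct Item_class values per UserId, then turn the resulting
# Series back into a data frame with a named count column.
result = (
    df.groupby("UserId")["Item_class"]
      .nunique()
      .reset_index(name="number_items")
)

print(result)
```

Here `reset_index(name="number_items")` moves UserId out of the index into a regular column and names the count column, which gives the ['UserId','number_items'] layout asked for.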
