Reputation: 2672
I have a dataframe like this:
User ID  Item  Category
U1       A     Furniture
U2       B     Sports
U3       C     Furniture
U2       A     Grocery
U3       B     Sports
U2       B     Sports
...
What I want is to build a dictionary of the users who have bought at least 3 of the same items as another user. For example:
Let's say user U1 has bought items A, B, C, D, E, L, M. User U2 has bought items A, B, C, i.e. 3 items in common with U1. User U3 has bought B, C, L.
So if I want to find all users who have bought at least 3 of the same items as U1, a dictionary should be returned in the following form:
{U2: [A, B, C], U3: [B, C, L], ...}
I have tried doing it with groupby() but it doesn't work. How do I achieve this?
Thanks
Upvotes: 0
Views: 68
Reputation: 323226
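IIUC, you can filter on the items the reference user bought, count them per user, and collect the matches: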
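# Assumes the columns are named 'UserID' and 'Item'
ID = 'U1'  # reference user
n = 1      # minimum number of common items; the question asks for 3
# Items bought by the reference user
Ux = df.loc[df.UserID == ID, 'Item'].tolist()
# Rows where another user bought one of those items, with repeat purchases dropped
common = df.loc[df.Item.isin(Ux) & (df.UserID != ID), ['UserID', 'Item']].drop_duplicates()
# Number of distinct common items per user
s = common.groupby('UserID').Item.count()
s1 = s[s >= n].index.tolist()
# Common items for each user that meets the threshold
d = common.loc[common.UserID.isin(s1)].groupby('UserID').Item.apply(list).to_dict()
d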
Out[156]: {'U3': ['C'], 'U4': ['A']}
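For comparison, here is a minimal plain-Python sketch of the same idea using set intersection, run on the example described in the question. The function name `common_buyers` and the extra user U4 are made up for illustration; the (user, item) pairs would come from something like `zip(df['UserID'], df['Item'])`, assuming those column names.

```python
from collections import defaultdict


def common_buyers(pairs, ref_user, n=3):
    """Map each other user to the sorted items shared with ref_user,
    keeping only users who share at least n distinct items."""
    items_by_user = defaultdict(set)
    for user, item in pairs:
        items_by_user[user].add(item)  # sets ignore repeat purchases

    ref_items = items_by_user.get(ref_user, set())
    result = {}
    for user, items in items_by_user.items():
        if user == ref_user:
            continue
        shared = items & ref_items  # items both users bought
        if len(shared) >= n:
            result[user] = sorted(shared)
    return result


# Purchases from the question's example; U4 is a hypothetical extra user
# who shares only one item with U1 and so should be filtered out.
pairs = (
    [("U1", item) for item in "ABCDELM"]
    + [("U2", item) for item in "ABC"]
    + [("U3", item) for item in "BCL"]
    + [("U4", item) for item in "AZ"]
)

result = common_buyers(pairs, "U1", n=3)
print(result)  # {'U2': ['A', 'B', 'C'], 'U3': ['B', 'C', 'L']}
```

Because each user's purchases are stored as a set, buying the same item twice does not count twice toward the threshold.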
Upvotes: 3