Ali Maan
Ali Maan

Reputation: 83

Generating User / Item interactions

I have a pandas dataframe (Interactions dataframe) with columns as User, Item, Rating.

Ratings ItemID   UserID
1      1172952     A74
1      1178735     176
4      341785     70C
3      136771     67E
2      1178883     383

Let's say I have two more dataframes with 200 users and 1000 ietms respectively. The problem is that in interactions dataframe, i need ratings for each user and each item combination. 0 rating for user/item combination if there is no interaction available for that user and item in interactions dataframe.

I have tried using a loop like this:

item_ids = np.repeat(item_data.id.values, len(user_data.id.values))
user_ids = np.tile(user_data.id.values, len(item_data.id.values))
ratings = np.empty([len(item_ids)])

for i in range(len(ratings)):
    inter = interactions.loc[(interactions['UserID'] == user_ids[i]) & (interactions["ItemID"] == item_ids[i]), "Ratings"]
    if not inter.empty:
        ratings[i] = inter.values[0]
    else:
        ratings[i] = 0

interactions = np.stack((ratings, item_ids, user_ids), axis=-1)

But it takes 40 seconds to complete for a ratings array of just 30,000 rows. Is there a quick way of doing this? Thanks for help.

Upvotes: 2

Views: 211

Answers (1)

DYZ
DYZ

Reputation: 57033

Your explanation of the problem is a little sloppy, but I have a feeling that you need this:

interactions.set_index(['ItemID','UserID'])\
            .unstack().fillna(0).astype(int).stack()\
            .reset_index()

This code creates a rectangular table of users and items, fills the voids with zeros, and converts the table back into a "tall" vector. Output:

     ItemID UserID  Ratings
0    136771    176        0
1    136771    383        0
2    136771    67E        3
3    136771    70C        0
4    136771    A74        0
5    341785    176        0
6    341785    383        0
7    341785    67E        0
8    341785    70C        4
9    341785    A74        0
10  1172952    176        0
....

I assume that each item and each user (but not their combinations!) are referenced in the interaction table at least once. If not, some merge'ing with the other two tables is needed.

Upvotes: 2

Related Questions