Reputation: 339
I am working with Python 3.5 and a DataFrame with columns = ['users_id', 'item_id', 'rating', 'timestamp', 'title'], and I am using
model = LightFM(loss='warp')
as my recommender model.
For training I need a sparse matrix in a specific format: (users_id, item_id) rating.
But I never got it to work. When I use
scipy.sparse.csr_matrix(data['users_id'])
it gives me something like this:
(0,0) 5
(0,1) 5
(0,2) 4
(0,3) 5
How should I proceed?
Upvotes: 4
Views: 1195
Reputation: 86
If you want to create a sparse matrix to use afterwards in your LightFM model, I think you should use the Dataset object provided by the library. For example, if I call your DataFrame df:
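from lightfm.data import Dataset
# Map the raw user and item ids to LightFM's internal indices.
data = Dataset()
data.fit(df.users_id.unique(), df.item_id.unique())
# Build both matrices from (user_id, item_id, rating) tuples.
interactions_matrix, weights_matrix = data.build_interactions(
    df[['users_id', 'item_id', 'rating']].itertuples(index=False, name=None)
)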
The fit method maps your users_id and item_id values to internal ids, and the build_interactions method creates two sparse matrices: a binary one holding only the interactions between users and items, and another one holding the weights (i.e. the ratings). It takes an iterable of (user_id, item_id) or (user_id, item_id, weight) tuples as its parameter.
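To make the mapping idea concrete, here is a rough pure-Python sketch of what "map ids to inner ids, then build an interactions and a weights structure" means. It is not LightFM's implementation, and the sample tuples and the build_index helper are hypothetical names made up for the illustration:

```python
# Conceptual sketch only: this is NOT how LightFM implements Dataset.
# Hypothetical sample of (user_id, item_id, rating) tuples.
ratings = [
    ("u42", "i7", 5.0),
    ("u42", "i9", 4.0),
    ("u13", "i7", 3.0),
]


def build_index(ids):
    """Map each distinct id to a consecutive integer, in order of first appearance."""
    index = {}
    for raw_id in ids:
        index.setdefault(raw_id, len(index))
    return index


user_index = build_index(u for u, _, _ in ratings)
item_index = build_index(i for _, i, _ in ratings)

# Interactions: 1 at every observed (user index, item index) position.
interactions = {(user_index[u], item_index[i]): 1 for u, i, _ in ratings}
# Weights: the same positions, holding the rating instead of 1.
weights = {(user_index[u], item_index[i]): r for u, i, r in ratings}

print(user_index)  # {'u42': 0, 'u13': 1}
print(weights)     # {(0, 0): 5.0, (0, 1): 4.0, (1, 0): 3.0}
```

The Dataset object does this bookkeeping for you and returns real sparse matrices instead of dictionaries, which is why I would rather use it than roll my own.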
Then you can use these two matrices created with build_interactions to fit your model in LightFM.
from lightfm import LightFM
model = LightFM(loss='warp')
model.fit(interactions_matrix, sample_weight = weights_matrix)
You can find more information in the LightFM documentation, for example in the section about Building Datasets or in the Quickstart.
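As for why your attempt printed only (0, k) entries: as far as I can tell, passing a single 1-D column to csr_matrix produces a matrix with one row, so every value ends up in row 0. If you ever want to build the (user, item) rating matrix by hand with SciPy instead of the Dataset object, a minimal sketch could look like the following. The sample arrays are hypothetical stand-ins for your columns, and np.unique is just one way to get consecutive indices:

```python
import numpy as np
from scipy.sparse import coo_matrix, csr_matrix

# Hypothetical sample columns standing in for df.users_id, df.item_id, df.rating.
users = np.array([10, 10, 20, 30])
items = np.array([7, 8, 7, 9])
ratings = np.array([5, 4, 3, 5])

# A single 1-D column becomes a 1 x n matrix, so every entry sits in row 0.
one_row = csr_matrix(ratings)
print(one_row.shape)  # (1, 4)

# return_inverse=True gives, for each element, the index of its id
# in the sorted array of unique ids.
user_ids, rows = np.unique(users, return_inverse=True)
item_ids, cols = np.unique(items, return_inverse=True)

# Place each rating at (user index, item index).
matrix = coo_matrix(
    (ratings, (rows, cols)),
    shape=(len(user_ids), len(item_ids)),
)
print(matrix.toarray())
```

With this approach you have to keep user_ids and item_ids around yourself to translate indices back to the original ids, which the Dataset object handles for you.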
Upvotes: 7