Reputation: 351
I'm working on building a recommendation engine for movies and have read a lot of good information that's out there. One thing I never see mentioned is how to make recommendations for new users and items. The normal process goes: I build my model and train it. I then input a user along with the top k recommendations I want returned for them.
Now, what if I want to do this for a user that was not in my initial sparse ratings matrix? If I have a sparse array of movie ratings for this new user, is there an easy way of incorporating it into the model without re-training the whole model again from scratch?
I know content-based filtering is used to solve the "cold-start" problem of CF. Is that my only option even if I have some ratings for this new user already?
Right now I am looking into Weighted Alternating Least Squares (WALS), and eventually I'll want to do this for SGD as well.
Upvotes: 7
Views: 6157
Reputation: 1869
"One thing I never see mentioned is how to make recommendations for new users and items."
This is indeed a difficult undertaking. In the case of a complete user cold start, additional data must be used to relate the new user to other (already known) users in advance. Typical approaches use, for example, demographic data to cluster users:
Safoury, Laila, and Akram Salah. "Exploiting user demographic attributes for solving cold-start problem in recommender system." Lecture Notes on Software Engineering 1.3 (2013): 303-307.
Basically, the trick when making suggestions for completely new users is to describe them in terms of features the algorithm has already seen during the training phase. The same applies to a complete item cold start. Please note the difference between complete and partial cold-start problems: in the partial case, some information about the user/item is already available, but not yet a "sufficient" amount.
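As a rough illustration of describing a new user through features seen in training, here is a minimal sketch. It is not the method from the paper cited above. It assumes you already have a trained matrix of user factors and one demographic label per known user; the function names, group labels, and toy numbers are all hypothetical. A brand-new user is simply scored with the mean latent vector of their demographic group.

```python
import numpy as np

def group_mean_factors(user_factors, user_groups):
    """Average the learned user factors within each demographic group."""
    return {
        str(g): user_factors[user_groups == g].mean(axis=0)
        for g in np.unique(user_groups)
    }

def recommend_for_new_user(group, group_factors, item_factors, k=3):
    """Score every item with the group's mean vector; return top-k item indices."""
    scores = item_factors @ group_factors[group]
    return np.argsort(-scores)[:k]

# Toy data (hypothetical): 4 known users, 2 latent factors, 5 items.
user_factors = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]])
user_groups = np.array(["18-25", "18-25", "40-60", "40-60"])
item_factors = np.array(
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.9, 0.1], [0.1, 0.9]]
)

group_factors = group_mean_factors(user_factors, user_groups)
top_items = recommend_for_new_user("18-25", group_factors, item_factors, k=2)
print(top_items)
```

Once the user has provided a few ratings, this group-level vector can be replaced by a vector fitted to their own ratings.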
"Is there an easy way of incorporating it into the model without re-training the whole model again from scratch?"
Yes, there have been attempts to achieve this, although how it works depends heavily on the factorization approach you are using. See, for example, this paper:
Luo, Xin, Yunni Xia, and Qingsheng Zhu. "Incremental collaborative filtering recommender based on regularized matrix factorization." Knowledge-Based Systems 27 (2012): 271-280.
However, to the best of my knowledge, no implemented solution is available for Python.
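For an SGD-trained model, one simple variant (a sketch under stated assumptions, not the algorithm from the paper above) is to freeze the trained item factors and run SGD only on the new user's vector. The function name `sgd_fold_in_user`, the hyperparameter defaults, and the toy data are hypothetical, and bias terms are omitted for brevity.

```python
import numpy as np

def sgd_fold_in_user(item_factors, item_ids, ratings,
                     reg=0.05, lr=0.05, epochs=200, seed=0):
    """Fit a latent vector for one new user; item factors stay fixed."""
    rng = np.random.default_rng(seed)
    p = rng.normal(scale=0.01, size=item_factors.shape[1])
    for _ in range(epochs):
        # Visit the user's ratings in random order each epoch.
        for idx in rng.permutation(len(item_ids)):
            q = item_factors[item_ids[idx]]
            err = ratings[idx] - p @ q
            # Gradient step on squared error with L2 regularization.
            p += lr * (err * q - reg * p)
    return p

# Toy check (hypothetical data): ratings generated by the vector [2, 3].
item_factors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
new_user = sgd_fold_in_user(item_factors, [0, 1, 2], [2.0, 3.0, 5.0], reg=0.0)
print(new_user)  # close to [2, 3] when reg=0.0
```

Because only one small vector is updated, this is cheap compared with retraining, but the item factors do not benefit from the new ratings until the next full training run.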
"Is that my only option even if I have some ratings for this new user already?"
If you have only a few ratings per user, additional information is often not necessary to get practical results. However, the results vary greatly depending on the method: basic matrix factorization models (e.g., Koren and Bell) do not perform very well in this setting. Consider ranking-based MF approaches (e.g., LightFM, https://github.com/lyst/lightfm), which can also take content information into account.
Upvotes: 3
Reputation: 640
I think what you are looking for is how to fold in a new item/user in matrix factorization collaborative filtering. This was already discussed in "How can I handle new users/items in model generated by Spark ALS from MLlib?", which points to places where you can find example solutions (with some code). It targets the Spark ALS implementation, but the main idea stays the same.
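To make the fold-in idea concrete, here is a minimal NumPy sketch, assuming explicit ratings and an already trained item-factor matrix. It performs a single ALS-style user update: with the item factors held fixed, the new user's vector is the solution of a small ridge regression. The names `fold_in_user` and `top_k` and the toy data are hypothetical, and this is the plain unweighted case; WALS would add per-rating weights to the normal equations, and some implementations scale the regularization by the number of ratings.

```python
import numpy as np

def fold_in_user(item_factors, item_ids, ratings, reg=0.1):
    """Solve (Q^T Q + reg * I) p = Q^T r for the new user's vector p."""
    Q = item_factors[item_ids]               # factors of the rated items
    r = np.asarray(ratings, dtype=float)
    A = Q.T @ Q + reg * np.eye(Q.shape[1])
    return np.linalg.solve(A, Q.T @ r)

def top_k(user_vector, item_factors, k, exclude=()):
    """Return the k best-scoring item indices, skipping already rated items."""
    scores = item_factors @ user_vector
    scores[np.asarray(list(exclude), dtype=int)] = -np.inf
    return np.argsort(-scores)[:k]

# Toy data (hypothetical): 4 items, 2 latent factors.
item_factors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])
rated_ids, rated_values = [0, 1], [4.0, 1.0]

p_new = fold_in_user(item_factors, rated_ids, rated_values, reg=0.1)
recs = top_k(p_new, item_factors, k=2, exclude=rated_ids)
print(p_new, recs)
```

Folding in a new item works the same way with the roles swapped: hold the user factors fixed and solve for the item's vector from the ratings it has received.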
Upvotes: 4