Reputation: 1
Can I compute recommendations for new users whose item preferences fall within the existing item set, using the item similarity matrix already computed from existing user ratings, without recomputing that matrix?
Upvotes: 0
Views: 832
Reputation: 5702
Not using the Mahout recommenders. They cannot recommend for users or items not in the training set.
However, you can use Mahout's itemsimilarity job along with a search engine to do exactly what you describe. Use itemsimilarity to create an "indicator" matrix of item-item similarities. Index these using something like Solr. I do this by creating a CSV of (itemID0,itemID1 itemID2 ... itemIDn). Each line has an itemID as the document ID, followed by a space-delimited list of itemID tokens. You may want to use application-specific IDs like SKUs or catalog IDs.
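As a rough sketch of the data prep described above, the snippet below groups item-item similarity triples into one line per item, with the item as the document ID and its similar items as space-delimited tokens. The input layout of (item, similar item, score) triples, the function names, the `top_n` cutoff, and the sample SKUs are all assumptions for illustration, not the actual output format of the itemsimilarity job.

```python
from collections import defaultdict


def build_indicator_docs(similar_pairs, top_n=50):
    """Group (item, similar_item, score) triples into one token list per item.

    The triple layout is an assumption for this sketch; adapt the parsing
    to whatever your similarity job actually writes out.
    """
    by_item = defaultdict(list)
    for item, other, score in similar_pairs:
        by_item[item].append((score, other))

    docs = {}
    for item, scored in by_item.items():
        # Strongest indicators first; ties broken by item ID for stable output.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        docs[item] = [other for _, other in scored[:top_n]]
    return docs


def to_csv_lines(docs):
    """Render each item as 'docID,token token token' for indexing."""
    return [f"{item},{' '.join(tokens)}" for item, tokens in sorted(docs.items())]


# Hypothetical sample data using SKU-style IDs.
pairs = [
    ("sku-1", "sku-2", 0.9),
    ("sku-1", "sku-3", 0.4),
    ("sku-2", "sku-1", 0.9),
    ("sku-3", "sku-1", 0.4),
]
lines = to_csv_lines(build_indicator_docs(pairs))
for line in lines:
    print(line)
```

Each printed line can then be indexed as one document, with the token list in a text field that the search engine tokenizes on whitespace.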
Then for the search query use the new user's history expressed as item ID tokens (the same ones you have indexed). You will get back an ordered list of items to recommend even though the user was not in the training data.
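To make the query step concrete, here is a small in-memory stand-in for the search engine. It ranks indexed items by how many of the user's history tokens appear in each item's indicator list. A real engine such as Solr would use its own relevance scoring rather than a plain overlap count, and skipping items the user has already seen is an assumption of this sketch. The `recommend` function and the sample index are hypothetical.

```python
def recommend(indicator_docs, user_history, limit=10):
    """Rank indexed items by overlap between their indicators and the history.

    This plain overlap count only imitates a search engine's ranking;
    it is not how Solr scores documents.
    """
    history = set(user_history)
    scored = []
    for item_id, indicators in indicator_docs.items():
        if item_id in history:
            continue  # assumption: do not recommend items already in the history
        overlap = len(history.intersection(indicators))
        if overlap > 0:
            scored.append((overlap, item_id))
    # Highest overlap first; ties broken by item ID for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored[:limit]]


# Hypothetical index: document ID -> indicator tokens.
indicator_docs = {
    "sku-1": ["sku-2", "sku-3"],
    "sku-2": ["sku-1", "sku-4"],
    "sku-3": ["sku-1"],
    "sku-4": ["sku-2", "sku-1"],
}

# A new user who was never in the training data, described only by history.
recs = recommend(indicator_docs, ["sku-1", "sku-2"])
print(recs)
```

The point of the sketch is that the user only appears at query time, so nothing in the index has to be rebuilt when a new user shows up.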
If you use the Mahout 1.0 snapshot, there is now a spark-itemsimilarity job that takes in your application-specific IDs and outputs them in the exact format you will feed to the search engine, so there may be no data prep on your part. But with a little prep and post-processing you can do the same with the Hadoop version of itemsimilarity in 0.9.
This technique is described in "Practical Machine Learning" by Ted Dunning of MapR. You can get a free copy of it on their site or ask about it on the Mahout mailing list.
There is a demo site built using this technique at https://guide.finderbots.com. You can see it work by registering and going through the trainer pages, then checking your own recommendations--no recalculation of the indicators between your input and getting your recs.
Upvotes: 2