Reputation: 379
I have a dataset of 50 million user preferences containing 8 million distinct users and 180K distinct products. I am currently using a boolean data model and have a basic Tanimoto-similarity-based recommender in place. I am trying to explore different algorithms for better recommendations, and started out with SVD using the ALSWR factorizer. I have used the base SVD recommender provided in Mahout as follows:
DataModel dataModel = new FileDataModel(new File("/FilePath"));
// 50 latent features, lambda = 0.065, 15 ALS iterations
ALSWRFactorizer factorizer = new ALSWRFactorizer(dataModel, 50, 0.065, 15);
Recommender recommender = new SVDRecommender(dataModel, factorizer);
As per my basic understanding, I believe the factorisation takes place offline and produces the user features and item features, while the actual requests are served by computing the top products for a user as the dot product of the user vector with all the candidate item vectors.
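To make the serving step concrete, here is a minimal plain-Java sketch of what I think happens online. This is not Mahout's actual code; the class name, the itemFeatures map, and the topN parameter are just illustrative placeholders.

import java.util.*;

// Illustrative only: scores every item for one user by dot product and keeps the top N.
public class DotProductScoring {

    // Dot product of two equal-length feature vectors.
    static double dot(double[] u, double[] v) {
        double s = 0.0;
        for (int i = 0; i < u.length; i++) {
            s += u[i] * v[i];
        }
        return s;
    }

    // userFeatures: one user's latent vector from the offline factorisation (e.g. length 50).
    // itemFeatures: itemID -> latent vector for that item.
    // Returns the topN item IDs with the highest dot-product score.
    static List<Long> topProducts(double[] userFeatures, Map<Long, double[]> itemFeatures, int topN) {
        // Min-heap on score, so only the best topN candidates are kept.
        PriorityQueue<Map.Entry<Long, Double>> heap =
                new PriorityQueue<>((a, b) -> Double.compare(a.getValue(), b.getValue()));
        for (Map.Entry<Long, double[]> e : itemFeatures.entrySet()) {
            heap.offer(new AbstractMap.SimpleEntry<>(e.getKey(), dot(userFeatures, e.getValue())));
            if (heap.size() > topN) {
                heap.poll();
            }
        }
        List<Long> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result); // highest score first
        return result;
    }
}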
I have a couple of doubts regarding this approach:
Upvotes: 4
Views: 1810
Reputation: 1395
I want to answer all your questions together.
Given the size of your data and the real-time nature of the requests, you should take a different approach.
Upvotes: 2