Stepan Yakovenko

Reputation: 9216

What is the right way to incrementally update a Mahout recommender model?

I have a stream of user-item pairs. I hold a block based on the last 6M records and update it each minute. I don't like that some important data might go unused between these rebuilds. For example, a new user may have joined the system, but the model doesn't know about them yet. I've found the class PlusAnonymousConcurrentUserDataModel, which allows adding a few entries to the model to get more accurate recommendations. However, the documentation describes a more constrained usage scenario for it. I have to:

Is it OK to use this class to collect data incrementally until the model is actually rebuilt by the timer? What is the right way to do this? It seems that PlusAnonymousConcurrentUserDataModel is intended for a somewhat different purpose.

Upvotes: 1

Views: 131

Answers (1)

pferrel

Reputation: 5702

This part of Mahout is very old and is being deprecated. I think it is not even in the 0.14.0 build; you would have to build it from source.

Mahout now uses a whole new technology for recommending. The new algorithm is called Correlated Cross-Occurrence (CCO). The old method you are using does not make use of real-time input in the way you have outlined. CCO can recommend to anonymous users who have not been built into the model, as long as there is behavioral data for them in some form.
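To illustrate why a user does not need to be in the model to get recommendations, here is a minimal sketch in plain Python. It is not Mahout's CCO implementation: it uses raw co-occurrence counts for a single action type and skips the log-likelihood filtering and cross-action indicators that CCO adds. The function names `build_cooccurrence` and `recommend` and the toy data are hypothetical. The point is only that the model is item-to-item, so the query is a list of recently seen items rather than a user ID.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(history):
    """Build an item-to-item co-occurrence table.

    history: dict mapping user id -> set of item ids the user interacted with.
    Returns: dict mapping item -> {other_item: number of users with both}.
    """
    co = defaultdict(lambda: defaultdict(int))
    for items in history.values():
        # Count every unordered pair of items seen by the same user.
        for a, b in combinations(sorted(items), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(co, recent_items, top_n=3):
    """Score items by how often they co-occur with the user's recent items.

    The user never has to appear in the training data; only their
    recent items are needed at query time.
    """
    scores = defaultdict(int)
    for item in recent_items:
        for other, count in co.get(item, {}).items():
            if other not in recent_items:
                scores[other] += count
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


# Toy interaction data (hypothetical).
history = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b"},
    "u3": {"b", "c", "d"},
}
model = build_cooccurrence(history)

# A brand-new user who has only touched item "a" so far.
print(recommend(model, {"a"}))
```

In this sketch the periodically rebuilt part is only the item-to-item table; fresh events from a new user are supplied at query time, which is the property described above.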

The architecture to implement CCO requires a datastore in a DB and a KNN engine (search engine) to make model queries. These are all packaged together in Apache PredictionIO + the Universal Recommender template.

Community support for the Universal Recommender itself can be found here: https://groups.google.com/forum/#!forum/actionml-user or on the mailing lists of the other projects.

Upvotes: 1
