David Parks

Reputation: 32051

Mahout Recommender: What relative preference values are suitable for a GenericUserBasedRecommender?

In Mahout, I'm setting up a GenericUserBasedRecommender. It's pretty straightforward for now, with typical settings.

In generating a "preference" value for an item, we have the following 5 data points:

Positive interest

Negative interest

Over what range should I express these different attributes? Let's use a 1-100 scale for discussion.

I know the final answer lies in trial and error and in the meaning of our data, but as far as the algorithm goes, I'm trying to understand at what point I need to tip the scales between interest and disinterest for the algorithm to function properly.

Upvotes: 1

Views: 462

Answers (1)

Sean Owen

Reputation: 66876

The actual range does not matter, at least not for this implementation. 1-100 is OK, 0-1 is OK, etc. The relative values are all that really matter here.

These values are estimated by a simple (linearly) weighted average. Therefore the response ought to be "linear": it ought to match the intuition that if action X gets a score twice as high as action Y, then X should indicate twice as much interest in real life.
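
To make the weighted-average point concrete, here is a minimal sketch in plain Java. It is not Mahout's code; it only assumes the estimate is a similarity-weighted average of the neighbors' preference values. The class name, method name, and sample numbers are all hypothetical.

```java
public class WeightedPreferenceSketch {

    // Hypothetical helper: similarity-weighted average of neighbor preferences.
    // similarities[i] is the similarity between the target user and neighbor i,
    // preferences[i] is neighbor i's preference value for the item.
    public static double estimate(double[] similarities, double[] preferences) {
        double weightedSum = 0.0;
        double totalSimilarity = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            weightedSum += similarities[i] * preferences[i];
            totalSimilarity += similarities[i];
        }
        // No usable neighbors means no estimate can be made.
        return totalSimilarity == 0.0 ? Double.NaN : weightedSum / totalSimilarity;
    }

    public static void main(String[] args) {
        double[] sims = {0.9, 0.5, 0.1};          // made-up similarities
        double[] prefs = {100.0, 2.0, 2.0};       // made-up values on a 1-100 scale

        // The same preferences expressed on a 0-1 scale.
        double[] scaled = new double[prefs.length];
        for (int i = 0; i < prefs.length; i++) {
            scaled[i] = prefs[i] / 100.0;
        }

        System.out.println("1-100 scale estimate: " + estimate(sims, prefs));
        System.out.println("0-1 scale estimate:   " + estimate(sims, scaled));
    }
}
```

Dividing every preference by 100 divides the estimate by 100 as well, so the ordering of items is unchanged. That is why only the ratios between the values you assign carry information.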

A decent place to start is to simply size them relative to their frequency. If click-to-conversion rate is 2%, you might make a click worth 2% of a conversion.
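
As a rough sketch of that sizing rule, the snippet below derives a click's value from observed counts. The class name, method name, counts, and the choice of 100 as the value of a conversion are hypothetical, chosen only to match the 2% example.

```java
public class FrequencyWeightSketch {

    // Hypothetical helper: value of the weaker action (e.g. a click), sized by
    // how often it leads to the stronger action (e.g. a conversion).
    public static double weakActionValue(long strongCount, long weakCount, double strongValue) {
        if (weakCount <= 0) {
            throw new IllegalArgumentException("weakCount must be positive");
        }
        // strongCount / weakCount is the observed conversion rate.
        return strongValue * strongCount / weakCount;
    }

    public static void main(String[] args) {
        long conversions = 20;      // made-up counts giving a 2% rate
        long clicks = 1000;
        double conversionValue = 100.0;

        double clickValue = weakActionValue(conversions, clicks, conversionValue);
        System.out.println("click value: " + clickValue);
    }
}
```

With a 2% click-to-conversion rate and a conversion valued at 100, a click comes out at 2. This is only a starting point to be tuned against your own data.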

I would ignore the "Indifference" signal you propose. It is likely going to be too noisy to be of use.

Upvotes: 3
