Reputation: 19
I'm trying to build a user-based collaborative filtering recommender in MLlib to find similar users in the Last.fm dataset (based on the artists they listen to).
Apache Mahout can do what I want through GenericBooleanPrefUserBasedRecommender, but it is not fast enough, so I wanted to try Spark and MLlib. However, I can't find any implementation of it there. Does anyone have a working Java/Scala/Python implementation of this, or an idea of how to implement it? I know that MLlib offers recommendations through ALS, but that is a different approach.
Upvotes: 0
Views: 1427
Reputation: 5702
Apache Mahout has a Spark version of "item-similarity", which has been integrated into the ActionML Universal Recommender. Mahout has been extended with a new cross-correlation algorithm that allows almost any user action to be used to find similar users or recommend items.
Mahout's Spark-based spark-rowsimilarity job is here. Recommender input consists of (user-id, "action-name", item-id) tuples. Accumulating all of the input gives a table whose rows are users and whose columns are items, so spark-rowsimilarity produces output that lists each user as the key and that user's most similar users as the values. This is not ideal, because it uses only one "action" to measure similarity. For the full power of Correlated Cross-Occurrence, which is the full version of the Mahout algorithm, you may want to look into the Universal Recommender.
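To make the rows = users, columns = items idea concrete, here is a minimal single-machine sketch in plain Python. It is not Mahout's code: the function name `row_similarity`, the `top_n` parameter, and the sample events are all made up for illustration, and it scores pairs with Jaccard (Tanimoto) similarity over boolean preferences, whereas Mahout's job uses its own similarity measure.

```python
from collections import defaultdict
from itertools import combinations


def row_similarity(events, top_n=2):
    """Map each user to its most similar users.

    events: iterable of (user_id, action_name, item_id) tuples.
    Hypothetical helper for illustration; only one action type is assumed.
    """
    # Accumulate the input into a boolean user x item table,
    # stored sparsely as user -> set of items.
    items_by_user = defaultdict(set)
    for user_id, _action, item_id in events:
        items_by_user[user_id].add(item_id)

    # Compare every pair of rows (users) with Jaccard similarity:
    # |shared items| / |items of either user|.
    scores = defaultdict(list)
    for u, v in combinations(sorted(items_by_user), 2):
        a, b = items_by_user[u], items_by_user[v]
        shared = len(a & b)
        if shared == 0:
            continue  # no overlap, so the pair is not listed at all
        sim = shared / len(a | b)
        scores[u].append((v, sim))
        scores[v].append((u, sim))

    # Keep the top_n most similar users per user (ties broken by user id).
    return {
        u: sorted(pairs, key=lambda p: (-p[1], p[0]))[:top_n]
        for u, pairs in scores.items()
    }


# Made-up sample data in the (user-id, "action-name", item-id) shape.
events = [
    ("alice", "listen", "radiohead"),
    ("alice", "listen", "muse"),
    ("bob", "listen", "radiohead"),
    ("bob", "listen", "muse"),
    ("bob", "listen", "coldplay"),
    ("carol", "listen", "coldplay"),
    ("dave", "listen", "abba"),
]

similar = row_similarity(events)
for user, neighbours in sorted(similar.items()):
    print(user, neighbours)
```

The all-pairs loop is quadratic in the number of users, which is the part a distributed job would take over. The output has the same shape as described above: a user as the key and the most similar users as the values.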
Upvotes: 1