Reputation: 469
The Spark ML implementation of the Alternating Least Squares (ALS) algorithm for recommendations produces a model that can be applied to already watched items (training.itemCol, the watched movies in the example below) in order to suggest new items (movieRecs in the example). How is it possible that the method returns already watched (non-new) items as part of the results (userRecs)?
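from pyspark.ml.recommendation import ALS

# training: DataFrame with userId, movieId and rating columns
als = ALS(maxIter=5, regParam=0.01, userCol="userId", itemCol="movieId", ratingCol="rating",
          coldStartStrategy="drop")
model = als.fit(training)
# Top 10 recommended movies for every user
userRecs = model.recommendForAllUsers(10)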
Upvotes: 0
Views: 642
Reputation: 71
All recommendForAllUsers() does is cross join the user factors with the item factors and return the top n products for each user, ranked by the value of the dot product. In that process, the cross join broadens the user-item space to every possible pair, including pairs that were already in the training data. Since the model was trained on those items (the watched movies), their dot product for that user will usually be high. Hence, items that appeared for a user at training time can appear again at recommendation time for the same user. The method does not filter them out.
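To make the mechanism concrete, here is a minimal plain-Python sketch of the same idea. It is not Spark's actual implementation: the factor vectors, the `watched` mapping and the `recommend_for_all_users` helper are all made up for illustration. It scores every user-item pair by dot product, keeps the top n, and optionally drops items the user has already seen.

```python
# Toy illustration with hypothetical factors (not learned by Spark).

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def recommend_for_all_users(user_factors, item_factors, n, seen=None):
    """Return {user: [(item, score), ...]} with the n highest-scoring items.

    If `seen` ({user: set_of_items}) is given, those items are excluded
    before the top n are taken.
    """
    recs = {}
    for user, uf in user_factors.items():
        # "Cross join": every item is a candidate for every user.
        scored = [
            (item, dot(uf, itf))
            for item, itf in item_factors.items()
            if seen is None or item not in seen.get(user, set())
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        recs[user] = scored[:n]
    return recs


# Hypothetical 2-dimensional latent factors.
user_factors = {1: [1.0, 0.0], 2: [0.0, 1.0]}
item_factors = {
    "A": [0.9, 0.1],
    "B": [0.8, 0.3],
    "C": [0.1, 0.9],
    "D": [0.2, 0.7],
}
# Items each user rated in the (imaginary) training set.
watched = {1: {"A"}, 2: {"C"}}

# Without filtering, the watched items rank first for their users.
raw = recommend_for_all_users(user_factors, item_factors, n=2)
# With filtering, only unseen items remain.
filtered = recommend_for_all_users(user_factors, item_factors, n=2, seen=watched)
```

In PySpark, a comparable filter could probably be obtained by exploding the `recommendations` column of `userRecs` and doing a `left_anti` join against `training` on `userId` and `movieId`. Requesting more than 10 items up front leaves enough candidates after the watched ones are removed.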
Upvotes: 2