peleitor

Reputation: 469

Spark ML ALS recommendation algorithm returning already watched items

Spark ML's implementation of the Alternating Least Squares (ALS) algorithm for recommendations produces a model that can be applied to already watched items (training.itemCol, for watched movies, in the example below) in order to suggest new items (userRecs in the example). How is it possible that the method returns already watched (non-new) items as part of those results?

from pyspark.ml.recommendation import ALS

als = ALS(maxIter=5, regParam=0.01, userCol="userId", itemCol="movieId", ratingCol="rating",
          coldStartStrategy="drop")

model = als.fit(training)

userRecs = model.recommendForAllUsers(10)

Upvotes: 0

Views: 642

Answers (1)

undying_odyssey

Reputation: 71

All recommendForAllUsers() does is:

  • Take the user blocks and item blocks (along with their vector embeddings) learned during training.
  • Compute the cross join of the user and item blocks.
  • Calculate the dot product of the embeddings for each user-item pair.
  • Select the top n items for each user, ranked by the value of the dot product.
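
The steps above can be sketched in plain Python. This is a toy illustration, not Spark's implementation: the factor values and the helper name recommend_for_all_users are made up, and the blocks are replaced by small dictionaries.

```python
# Toy latent factors standing in for what ALS learns during fit().
# All values are invented for illustration.
user_factors = {
    1: [1.0, 0.0],
    2: [0.0, 1.0],
}
item_factors = {
    10: [0.9, 0.1],  # assume user 1 rated this movie highly in training
    20: [0.8, 0.3],
    30: [0.1, 0.9],  # assume user 2 rated this movie highly in training
    40: [0.2, 0.7],
}


def dot(u, v):
    """Dot product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def recommend_for_all_users(user_factors, item_factors, n):
    """Hypothetical helper: score every user-item pair, keep the top n per user."""
    recs = {}
    for user, u_vec in user_factors.items():
        # "Cross join": every item is scored for every user,
        # whether or not the pair appeared in the training data.
        scored = [(item, dot(u_vec, i_vec)) for item, i_vec in item_factors.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        recs[user] = scored[:n]
    return recs


user_recs = recommend_for_all_users(user_factors, item_factors, n=2)
```

Nothing in these steps consults the training interactions, so movie 10 comes back at the top for user 1 even though, under the assumption above, that user already rated it.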

In the above process, the cross join covers the whole user-item space, including the pairs that were present in the training data, and nothing filters those pairs out afterwards. Since the model was fit to reproduce the ratings of those items (the watched movies), their dot products tend to be high whenever the user rated them well. Hence, items a user interacted with at training time can appear again in the recommendations for the same user.
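
One way to get only new items is to request more candidates than needed and then drop the pairs that occur in the training data. The sketch below shows the idea on plain Python structures; drop_seen, the sample recommendations, and the seen set are hypothetical. On DataFrames, a similar result could probably be obtained by exploding the recommendations column and doing a left_anti join against training on userId and movieId.

```python
def drop_seen(user_recs, seen, n):
    """Keep, per user, the n best-scored items that are not in `seen`.

    user_recs: dict mapping user id -> list of (item id, score), best first
    seen:      set of (user id, item id) pairs present in the training data
    """
    filtered = {}
    for user, recs in user_recs.items():
        # Preserve the original ranking, skipping already watched items.
        unseen = [(item, score) for item, score in recs if (user, item) not in seen]
        filtered[user] = unseen[:n]
    return filtered


# Hypothetical result of an over-sized request (4 candidates, 2 wanted).
user_recs = {1: [(10, 4.8), (20, 4.1), (40, 3.0), (30, 1.2)]}
seen = {(1, 10), (1, 40)}  # user 1 already watched movies 10 and 40

new_recs = drop_seen(user_recs, seen, n=2)
```

If the request size is n plus the largest number of items any single user has rated, at least n unseen candidates remain per user after filtering, provided the catalogue contains that many items.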

Upvotes: 2
