Reputation: 3148
I'm using PySpark MLlib and its built-in ALS implementation for collaborative filtering. Does Spark provide any other similarity measures for this kind of filtering, for example Pearson correlation or cosine similarity? Can they be computed inside the Spark environment?
Many thanks!
Upvotes: 1
Views: 141
Reputation: 1702
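Yes, Spark has an implementation of cosine similarity: RowMatrix.columnSimilarities() computes the cosine similarity between the columns of a distributed matrix. The official example is here: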
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/mllib/CosineSimilarity.scala
Example in Scala (taken from that file, so sc and params.inputFile come from its surrounding code):
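import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// Load and parse the data file: one matrix row per line, space-separated doubles.
val rows = sc.textFile(params.inputFile).map { line =>
  val values = line.split(' ').map(_.toDouble)
  Vectors.dense(values)
}.cache()
val mat = new RowMatrix(rows)
// Exact cosine similarity between every pair of columns.
val exact = mat.columnSimilarities()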
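On the Pearson part of the question: MLlib has Statistics.corr for correlation, and Pearson correlation is also just cosine similarity applied to mean-centered vectors. Here is a minimal plain-Python sketch of that relationship. It is not Spark code, the function names are my own, and it assumes small in-memory lists with no all-zero or constant columns. It is only meant to show what the column similarities represent.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumes non-zero norms)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(u, v):
    """Pearson correlation, computed as cosine similarity of mean-centered vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine([a - mean_u for a in u], [b - mean_v for b in v])


def column_similarities(rows):
    """Cosine similarity for every column pair (i, j) with i < j."""
    cols = list(zip(*rows))  # transpose: rows -> columns
    return {
        (i, j): cosine(cols[i], cols[j])
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    }


# Hypothetical toy ratings matrix: rows are users, columns are items.
ratings = [
    [1.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 0.0],
]
sims = column_similarities(ratings)
```

So if you center each column first, the same cosine machinery gives you Pearson. Keep in mind that centering turns sparse data into dense data, which may matter for large rating matrices.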
Upvotes: 1