LonsomeHell

Reputation: 593

Compute cosine similarity in Spark (Java)

How can I compute the cosine similarity between two Spark Vectors? I am using the new ml package.

Spark 2.1.1

EDIT:

Spark provides RowMatrix, which can be used to compute similarities, but it accepts an mllib.vector, not an ml.vector.

Is there a way to convert Vectors from the different packages? Is there an implementation that uses ml.vector?

Upvotes: 0

Views: 966

Answers (1)

Shaido

Reputation: 28322

The easiest way to convert from an ml vector to an mllib vector is to use the Vectors.fromML method from the mllib package; see the Vectors documentation. Example (Scala):

// Build a vector with the new ml package
val mlVector = org.apache.spark.ml.linalg.Vectors.dense(Array(1.0, 2.0, 3.0))
println(mlVector.getClass())

// Convert it to the older mllib type, which RowMatrix accepts
val mllibVector = org.apache.spark.mllib.linalg.Vectors.fromML(mlVector)
println(mllibVector.getClass())

This prints:

class org.apache.spark.ml.linalg.DenseVector
class org.apache.spark.mllib.linalg.DenseVector
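
If you only need the similarity between two individual vectors, it can also be computed directly from their values, without going through RowMatrix. The following is a minimal Java sketch, not part of the Spark API: the class name CosineSimilarity and the method cosine are hypothetical, and it assumes the input arrays were obtained by calling toArray() on each ml vector. Returning 0.0 for a zero vector is also an assumption; you may prefer to throw an exception instead.

```java
public class CosineSimilarity {

    // Cosine similarity of two equal-length arrays:
    // dot(a, b) / (||a|| * ||b||)
    public static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        // Assumption: treat similarity with a zero vector as 0.0
        // rather than dividing by zero.
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // In a Spark job these arrays would come from vector.toArray()
        double[] a = {1.0, 2.0, 3.0};
        double[] b = {2.0, 4.0, 6.0};
        // Parallel vectors, so the result is 1.0 up to rounding error
        System.out.println(cosine(a, b));
    }
}
```

This loops over the dense representation, so for very large sparse vectors it does more work than necessary; for a handful of pairwise comparisons that is usually acceptable.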

Upvotes: 2
