Natalia

Reputation: 91

Infinity vectors in Spark MLlib Word2Vec

I have a question about running Word2Vec from Spark MLlib. I run it with a vocabulary size of ~2.4M and a corpus size of ~1.4B. Why do I get ±Infinity vectors for some words? It happens when I increase the number of iterations: with 10 iterations I get a reasonable model, but with 20 iterations some vectors have the form [Infinity,-Infinity,Infinity,-Infinity,...]. Thanks in advance.

Upvotes: 9

Views: 380

Answers (1)

user48135

Reputation: 481

You can apply a function like this to each vector element, replacing infinite or NaN values with 0:

  // Replace an infinite or NaN value with 0.0; return finite values unchanged.
  def input_data(data_input: Double): Double =
    if (data_input.isInfinity || data_input.isNaN) 0.0 else data_input
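To apply the same replacement to a whole model, something like the following sketch might work. It assumes the vectors are available as a `Map[String, Array[Float]]`, which is the shape `Word2VecModel.getVectors` returns in MLlib. The names `SanitizeVectors`, `sanitize`, `sanitizeAll`, and `brokenWords` are hypothetical, and the two sample vectors are made up for illustration.

```scala
// Hypothetical helpers; the Map in main stands in for the
// Map[String, Array[Float]] returned by Word2VecModel.getVectors.
object SanitizeVectors {
  // Replace an infinite or NaN component with 0; keep finite values.
  def sanitize(x: Float): Float =
    if (x.isInfinity || x.isNaN) 0.0f else x

  // Apply the replacement to every component of every word vector.
  def sanitizeAll(vectors: Map[String, Array[Float]]): Map[String, Array[Float]] =
    vectors.map { case (word, vec) => word -> vec.map(sanitize) }

  // Words whose vectors contain at least one infinite or NaN component.
  def brokenWords(vectors: Map[String, Array[Float]]): Set[String] =
    vectors.collect {
      case (word, vec) if vec.exists(x => x.isInfinity || x.isNaN) => word
    }.toSet

  def main(args: Array[String]): Unit = {
    // Made-up sample data, not taken from a real model.
    val vectors = Map(
      "good" -> Array(0.1f, -0.2f),
      "bad"  -> Array(Float.PositiveInfinity, Float.NegativeInfinity)
    )
    println(brokenWords(vectors))
    println(sanitizeAll(vectors)("bad").mkString("[", ", ", "]"))
  }
}
```

Note that this only masks the bad values: a word whose vector is entirely infinite ends up with an all-zero vector, so listing the affected words first with `brokenWords` may be worthwhile.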

Upvotes: -2
