ZMath_lin

Reputation: 553

Calculate RMSE in ALS model

I'd like to calculate the RMSE of an ALS model. I found code like this:

    val ratings = data.map(_.split(',') match {
      case Array(user, item, rate) =>
        Rating(user.toLong, item.toInt, rate.toFloat)
    })

    val ratingsDF= ratings.toDF

    val model = new ALS().setRank(3).setMaxIter(10).fit(ratingsDF)
    val predictions = model.transform(ratingsDF)
    val evaluator = new RegressionEvaluator().setMetricName("rmse").setLabelCol("rating").setPredictionCol("prediction")
    val rmse = evaluator.evaluate(predictions)
    System.out.println("Root-mean-square error = " + rmse)

However, I get "NaN". I wonder whether the method I use is wrong or whether the problem lies in the data itself. If the code is wrong, what is the right way to calculate the RMSE? The only other method I have found is this:

    val predictions = model.predict(usersProducts).map { case Rating(user, product, rate) =>
      ((user, product), rate)
    }
    val ratesAndPreds = ratings.map { case Rating(user, product, rate) =>
      ((user, product), rate)
    }.join(predictions)
    val rmse = math.sqrt(ratesAndPreds.map { case ((user, product), (r1, r2)) =>
      val err = r1 - r2
      err * err
    }.mean())
    println(s"RMSE = $rmse")

This cannot be used here. How should I do it?

Upvotes: 4

Views: 1510

Answers (1)

Chris Snow

Reputation: 24606

This appears to be a defect. For more information, have a look at this Spark JIRA issue: https://issues.apache.org/jira/browse/SPARK-14489

When building a Spark ML pipeline containing an ALS estimator, the metrics "rmse", "mse", "r2" and "mae" all return NaN.
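A possible workaround, assuming the NaN comes from rows where ALS could not produce a prediction (for example, users or items that were not in the training data), is to drop rows with a NaN prediction before evaluating. The sketch below is only an illustration under that assumption: the function name `rmseIgnoringNaN` and the `training`/`test` parameters are hypothetical, and the column names are the ALS defaults (`user`, `item`, `rating`).

```scala
import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.sql.DataFrame

// Hypothetical helper: fits ALS on `training` and returns the RMSE on `test`,
// skipping rows for which the model produced no usable prediction.
// Both DataFrames are assumed to have "user", "item" and "rating" columns.
def rmseIgnoringNaN(training: DataFrame, test: DataFrame): Double = {
  val model = new ALS()
    .setRank(3)
    .setMaxIter(10)
    .fit(training)

  // na.drop removes rows whose "prediction" value is null or NaN,
  // so a single NaN no longer turns the whole metric into NaN.
  val predictions = model.transform(test).na.drop(Seq("prediction"))

  val evaluator = new RegressionEvaluator()
    .setMetricName("rmse")
    .setLabelCol("rating")
    .setPredictionCol("prediction")

  evaluator.evaluate(predictions)
}
```

On Spark 2.2 and later, calling `setColdStartStrategy("drop")` on the `ALS` estimator should have a similar effect without the explicit `na.drop`. If the result is still NaN after dropping such rows, the input ratings themselves may contain NaN values and are worth checking.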

Upvotes: 1
