Haha TTpro

Reputation: 5556

How to use sqrt on Double in Spark Scala

I am trying to calculate the Root Mean Square Error (RMSE) manually in Spark (Scala 2.11).

[Screenshot: se column]

As shown in the screenshot above, I calculate the squared error (SE) for each row:

val predicted_with_sqr_err = predicted.withColumn("se", pow(($"medianHouseValue" - $"prediction"), lit(2)))

Then I calculate the Mean Square Error (MSE):

val sum_se = predicted_with_sqr_err.agg(sum("se")).first.get(0)
val sum_se_double = sum_se.toString.toDouble
val mean_sqr_err = (1.0/predicted_with_sqr_err.count)*sum_se_double 

That works fine. But it fails when I try to take the square root to get the Root Mean Square Error (RMSE):

val root_mean_sqr_err = sqrt(mean_sqr_err)

It gives this error:

<console>:83: error: overloaded method value sqrt with alternatives:
  (colName: String)org.apache.spark.sql.Column <and>
  (e: org.apache.spark.sql.Column)org.apache.spark.sql.Column
 cannot be applied to (Double)
       val root_mean_sqr_err = sqrt(mean_sqr_err)


How should I fix this?

Upvotes: 1

Views: 2678

Answers (1)

Vitalii Honta

Reputation: 317

The problem is that you are calling the sqrt function defined in Spark SQL (org.apache.spark.sql.functions). That function is meant to be used only as part of the Spark SQL DSL (in selections, aggregations, etc.). It takes a Column or a String column name as its parameter, but you are passing a Double, so neither overload matches. Instead, use the sqrt function defined in the scala.math package:

val root_mean_sqr_err = math.sqrt(mean_sqr_err)
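
To make the distinction concrete, here is a minimal sketch in plain Scala with no Spark dependency. The object name RmseSketch and the helper rmse are hypothetical, introduced only for illustration. It mirrors the steps in the question (squared error, mean, square root) on ordinary Double values and calls scala.math.sqrt by its full name, so it cannot be confused with the Column-based sqrt from Spark SQL.

```scala
// Hypothetical helper, not from the original post: RMSE over plain Scala
// (actual, predicted) pairs, with no Spark dependency.
object RmseSketch {
  def rmse(pairs: Seq[(Double, Double)]): Double = {
    require(pairs.nonEmpty, "need at least one (actual, predicted) pair")
    // Squared error per pair, analogous to the "se" column in the question.
    val squaredErrors = pairs.map { case (actual, predicted) =>
      val diff = actual - predicted
      diff * diff
    }
    // Mean of the squared errors (MSE).
    val meanSquaredError = squaredErrors.sum / pairs.size
    // Fully qualified on purpose: this sqrt works on a Double, whereas
    // org.apache.spark.sql.functions.sqrt accepts only a Column or a column name.
    scala.math.sqrt(meanSquaredError)
  }
}
```

For example, RmseSketch.rmse(Seq((3.0, 1.0), (5.0, 5.0))) has squared errors 4.0 and 0.0, a mean of 2.0, and so returns the square root of 2.0. Writing scala.math.sqrt in full is one way to stay unambiguous when org.apache.spark.sql.functions._ is also imported in the same scope.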

Upvotes: 2
