pnet_fabric

Reputation: 89

apache spark add column which is a complex calculation

I have the following dataset df1 in Spark:

root
 |-- id: integer (nullable = true)
 |-- t: string (nullable = true)
 |-- x: double (nullable = false)
 |-- y: double (nullable = false)
 |-- z: double (nullable = false)

and I need to create a column whose value is calculated as

sqrt(x) + cbrt(y) + z*constantK

I'm trying something like the following:

val constantK=100500
val df2 = df1.select($"id", (scala.math.sqrt($"x") + scala.math.cbrt($"y") + $"z"*constantK))

However, I get a type mismatch error:

<console>:59: error: type mismatch;
 found   : org.apache.spark.sql.ColumnName
 required: Double
       val df2 = df1.select($"id", (scala.math.sqrt($"x") + scala.math.cbrt($"y") + $"z"*constantK))

What is the proper way of adding columns with complex calculations based on the values of other columns in the dataframe?

Upvotes: 0

Views: 145

Answers (1)

Emiliano Martinez

Reputation: 4123

That happens because you are trying to use scala.math functions, which operate on Double values, on Spark SQL Column objects. Spark SQL has its own functions and types:

import org.apache.spark.sql.functions.{sqrt, cbrt}

df1.select($"id", (sqrt($"x") + cbrt($"y") + $"z" * constantK).as("result"))

The '*' operator is supported directly on columns. Take a look at https://spark.apache.org/docs/2.3.0/api/sql/index.html
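To sanity-check the arithmetic that the Spark expression computes per row, here is the same formula evaluated in plain Scala (the sample values 4.0, 8.0, 2.0 are assumptions chosen so the roots come out exact; the Spark column version produces the same number for each row):

```scala
// Per-row check of sqrt(x) + cbrt(y) + z * constantK.
// scala.math functions take and return Double, which is exactly why they
// cannot be applied to Spark Column objects: a Column is a symbolic
// expression evaluated by Spark, not a numeric value.
val constantK = 100500

def rowResult(x: Double, y: Double, z: Double): Double =
  math.sqrt(x) + math.cbrt(y) + z * constantK

// sqrt(4) = 2, cbrt(8) = 2, 2 * 100500 = 201000
println(rowResult(4.0, 8.0, 2.0)) // 201004.0
```

Spark's `org.apache.spark.sql.functions.sqrt` and `cbrt` build the equivalent expression tree over columns, so the whole calculation runs inside Spark instead of on the driver.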

Upvotes: 2
