Guforu

Reputation: 4023

udf Function for DataType casting, Scala

I have the following DataFrame:

df.show()

+---------------+----+
|              x| num|
+---------------+----+
|[0.1, 0.2, 0.3]|   0|
|[0.3, 0.1, 0.1]|   1|
|[0.2, 0.1, 0.2]|   2|
+---------------+----+

Its columns have the following data types:

df.printSchema 
root
 |-- x: array (nullable = true)
 |    |-- element: double (containsNull = true)
 |-- num: long (nullable = true)

I am trying to convert the array of doubles in column x to an array of floats. I am doing it with the following udf:

val toFloat = udf[(val line: Seq[Double]) => line.map(_.toFloat)]
val test = df.withColumn("testX", toFloat(df("x")))

This code does not compile. Can anybody show me how to change the element type of an array column inside a DataFrame?

What I want is:

df.printSchema 
root
 |-- x: array (nullable = true)
 |    |-- element: float (containsNull = true)
 |-- num: long (nullable = true)

This question is based on the question How to change the simple DataType in Spark SQL's DataFrame

Upvotes: 0

Views: 1597

Answers (1)

cheseaux

Reputation: 5315

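Your udf is declared incorrectly. udf takes the function as an ordinary argument in parentheses (square brackets are for type parameters), and val is not allowed in a lambda's parameter list. You should write it as follows: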

val toFloat = udf((line: Seq[Double]) => line.map(_.toFloat))
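For completeness, here is a minimal end-to-end sketch of the corrected udf applied to a DataFrame shaped like the one in the question. The object name FloatCast, the helper toFloatSeq, and the local SparkSession are assumptions made for illustration and are not part of the original code. It also shows an alternative I believe works without a udf: casting the column with the type string "array<float>". Writing the result back to "x" (rather than "testX") replaces the column, which matches the schema the question asks for.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object FloatCast {
  // Hypothetical helper: the plain Scala conversion that the udf wraps.
  // Note: it does not guard against a null array.
  def toFloatSeq(line: Seq[Double]): Seq[Float] = line.map(_.toFloat)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("FloatCast")
      .getOrCreate()
    import spark.implicits._

    // Rebuild a DataFrame with the same shape as in the question.
    val df = Seq(
      (Seq(0.1, 0.2, 0.3), 0L),
      (Seq(0.3, 0.1, 0.1), 1L),
      (Seq(0.2, 0.1, 0.2), 2L)
    ).toDF("x", "num")

    // Option 1: the corrected udf, passed a function in parentheses.
    val toFloat = udf((line: Seq[Double]) => toFloatSeq(line))
    val viaUdf = df.withColumn("x", toFloat(col("x")))
    viaUdf.printSchema()

    // Option 2: no udf, cast the array column directly.
    val viaCast = df.withColumn("x", col("x").cast("array<float>"))
    viaCast.printSchema()

    spark.stop()
  }
}
```

If the built-in cast is available in your Spark version, it is usually preferable to a udf, since it stays inside Spark's own expression engine and passes null arrays through instead of failing on them.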

Upvotes: 1
