Giorgio
Giorgio

Reputation: 1093

access scala map from dataframe without using UDFs

I have a Spark (version 1.6) Dataframe, and I would like to add a column with a value contained in a Scala Map, this is my simplified code:

val map = Map("VAL1" -> 1, "VAL2" -> 2)
val df2 = df.withColumn("newVal", map(col("key")))

This code doesn't work and obviously I receive the following error, because the map expecting a String value, while receiving a column:

found   : org.apache.spark.sql.Column
required: String

The only way I could do that is using an UDF:

val map = Map("VAL1" -> 1, "VAL2" -> 2)
val myUdf = udf{ value:String => map(value)}
val df2 = df.withColumn("newVal", myUdf($"key"))

I want avoid the use of UDFs if possible.

Are there any other solutions available using just the DataFrame API (I would like also to avoid transforming it to RDD)?

Upvotes: 2

Views: 464

Answers (2)

mattinbits
mattinbits

Reputation: 10428

You could convert the Map to a Dataframe and use a JOIN between this and your existing dataframe. Since the Map dataframe would be very small, it should be a Broadcast Join and avoid the need for a shuffle phase.

Letting Spark know to use a broadcast join is described in this answer: DataFrame join optimization - Broadcast Hash Join

Upvotes: 2

user9811196
user9811196

Reputation:

TL;DR Just use udf.

With the version you use (Spark 1.6 according to your comment) there is no solution which doesn't require udf or map over RDD / Dataset.

In later versions you can:

  • use map functions (2.0 or later) to create literal MapType column

    import org.apache.spark.sql.functions
    
    val map = functions.map(
       Map("VAL1" -> 1, "VAL2" -> 2)
         .flatMap { case (k, v) =>  Seq(k, v) } .map(lit) .toSeq: _*
    )
    map($"key")
    
  • typedLit (2.2 or later) to create literal MapType column.

    val map = functions.typedLit(Map("VAL1" -> 1, "VAL2" -> 2))
    map($"key")
    

and use these directly.

Reference How to add a constant column in a Spark DataFrame?

Upvotes: 3

Related Questions