Ga999

Reputation: 71

How to split an array column into separate columns in Spark Scala?

I have a column that contains an array of objects:

"subscriberPhoneNbrs" : [
    {
        "phoneType" : "HOM",
        "phoneNbr" : "9045682704"
    },
    {
        "phoneType" : "WRK",
        "phoneNbr" : "9045749004"
    }
]

I want to split the array so that each phone type becomes its own column, as below:

"subHomePhone" : "9045682704",
"subWorkPhone" : "9045749004"

I tried using the explode function, but I am not getting the expected result.

Upvotes: 1

Views: 286

Answers (1)

Michel Lemay

Reputation: 2094

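You can turn the array into a map with map_from_entries (available since Spark 2.4) and then generate the list of columns to select from that map: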

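import org.apache.spark.sql.functions.{col, map_from_entries}
import spark.implicits._ // assumes a SparkSession named `spark`, as in spark-shell

case class Phone(phoneType: String, phoneNbr: String)

val df = List((0, List(Phone("HOM", "1234"), Phone("WRK", "5678")))).toDF("id", "subscriberPhoneNbrs")
df.show(false)

// Turn the array of (phoneType, phoneNbr) structs into a map keyed by phoneType
val dfMap = df.select(map_from_entries($"subscriberPhoneNbrs") as "phoneMap")

// Map each phone type to the desired output column name
val renameMap = Map("WRK" -> "subWorkPhone", "HOM" -> "subHomePhone")
val newCols = renameMap.map { case (phoneType, name) => col(s"phoneMap.$phoneType").alias(name) }.toList

dfMap.select(newCols: _*).show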

This results in the following:

+------------+------------+
|subWorkPhone|subHomePhone|
+------------+------------+
|        5678|        1234|
+------------+------------+

From the map_from_entries documentation:

static Column   map_from_entries(Column e)
Returns a map created from the given array of entries.
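
Since the question mentions explode, here is a minimal sketch of an alternative that uses explode followed by groupBy and pivot instead of map_from_entries. It is not the approach shown above, only a variation that should also work on Spark versions older than 2.4. The object name PhonePivot, the helper pivotPhones, and the local SparkSession settings are assumptions made for illustration; the column names come from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, first}

case class Phone(phoneType: String, phoneNbr: String)

object PhonePivot {
  // Hypothetical helper: one output row per id, one column per phone type.
  def pivotPhones(df: DataFrame): DataFrame = {
    // explode yields one row per array element; the struct fields are then flattened
    val exploded = df
      .select(col("id"), explode(col("subscriberPhoneNbrs")).as("phone"))
      .select(col("id"), col("phone.phoneType").as("phoneType"), col("phone.phoneNbr").as("phoneNbr"))

    // Listing the pivot values explicitly fixes the column order and avoids
    // an extra pass over the data to discover the distinct phone types.
    exploded
      .groupBy("id")
      .pivot("phoneType", Seq("HOM", "WRK"))
      .agg(first("phoneNbr"))
      .withColumnRenamed("HOM", "subHomePhone")
      .withColumnRenamed("WRK", "subWorkPhone")
  }

  def main(args: Array[String]): Unit = {
    // Local session for the example only; in spark-shell `spark` already exists
    val spark = SparkSession.builder().master("local[*]").appName("phone-pivot").getOrCreate()
    import spark.implicits._

    val df = List((0, List(Phone("HOM", "1234"), Phone("WRK", "5678")))).toDF("id", "subscriberPhoneNbrs")
    pivotPhones(df).show(false)

    spark.stop()
  }
}
```

For the sample row this should give the columns id, subHomePhone and subWorkPhone, holding 1234 and 5678 respectively. Note that the pivot needs a grouping key (here id); if a row has several numbers of the same type, first keeps only one of them.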

Upvotes: 3
