KAY_YAK

Reputation: 303

How to split two mapped columns in a dataframe simultaneously in spark

I have a dataframe of this form

+--------------------------------------+-----------------------------+
|hashMap                               |name                         |
+--------------------------------------+-----------------------------+
|[{"A":"0","B":"0","C":"0"}, {"X":"0"}]|[M, D]                       |
+--------------------------------------+-----------------------------+

I want to split it into this

+--------------------------------------+-----------------------------+
|hashMap                               |name                         |
+--------------------------------------+-----------------------------+
|"A":"0","B":"0","C":"0"               | M                           |
|"X":"0"                               | D                           |
+--------------------------------------+-----------------------------+

I know about explode and split, but I don't know whether they will work on two columns at once. Also, it's possible that sometimes there is only one value in each of the columns,

for example

+-----------+-----------+
|hashMap    |name       |
+-----------+-----------+
|[{"A":"0"}]|[M]        |
+-----------+-----------+

How do I make the explode/split generic enough to handle this in Scala?

Upvotes: 0

Views: 63

Answers (1)

mck

Reputation: 42332

You can zip the two arrays with arrays_zip and explode the zipped result with inline. This works for arrays of any length, including a single element:

val df2 = df.selectExpr("inline(arrays_zip(hashMap, name))")
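
For anyone who wants to try this end to end, here is a minimal sketch built around the one-liner above. It assumes Spark 2.4 or later (where arrays_zip is available) and assumes hashMap is an array of JSON strings and name is an array of strings; the question does not state the actual column types. The object name ZipExplodeSketch, the helper splitRows, and the local SparkSession settings are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ZipExplodeSketch {

  // arrays_zip pairs the N-th element of hashMap with the N-th element of name,
  // producing an array of structs; inline then emits one row per struct.
  def splitRows(df: DataFrame): DataFrame =
    df.selectExpr("inline(arrays_zip(hashMap, name))")

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed settings).
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("zip-explode-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample data mirroring the question: one row with two elements per array,
    // and one row with a single element per array.
    val df = Seq(
      (Seq("""{"A":"0","B":"0","C":"0"}""", """{"X":"0"}"""), Seq("M", "D")),
      (Seq("""{"A":"0"}"""), Seq("M"))
    ).toDF("hashMap", "name")

    splitRows(df).show(false)

    spark.stop()
  }
}
```

The two-element row should come out as two rows and the single-element row as one, so no special handling is needed for the single-value case.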

Upvotes: 1
