Reputation: 193
I have a dataframe (df) of the following form:
+----+-------+
| id | items |
+----+-------+
| 0  | item1 |
| 1  | item2 |
+----+-------+
Here the first column, id, is an int and the second column, items, is of type struct. Let's say each item looks like this:
item1
|-a
|-b
|-c
|-d
I want the resulting table to have this form:
+----+------+
| id | col2 |
+----+------+
| 0  | a    |
| 0  | b    |
| 0  | c    |
| 0  | d    |
| 1  | a    |
| 1  | b    |
| 1  | c    |
| 1  | d    |
+----+------+
In other words, I want to expand the struct so that each of its fields becomes its own row, for every id.
How can I do this?
Upvotes: 0
Views: 615
Reputation: 3544
This piece of code may solve your problem:
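// toDF on an RDD needs the session implicits (assuming a SparkSession named spark)
import spark.implicits._

df.rdd.flatMap { row =>
  val id = row.getInt(0)
  // A struct column comes back as a nested Row, not an Array[String];
  // this assumes every field is non-null
  val fieldValues = row.getStruct(1).toSeq.map(_.toString)
  fieldValues.map(value => (id, value))
}.toDF("id", "col2")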
Note: this code is not tested!
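If you would rather stay in the DataFrame API than drop down to the RDD, something along these lines might also work. It is only a sketch: the helper name explodeStruct, the object name, and the sample data are made up for illustration, and it assumes all fields of the struct share one type (here strings), because array() needs a common element type.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array, col, explode, struct}
import org.apache.spark.sql.types.StructType

object ExplodeStructSketch {
  // Hypothetical helper: emits one output row per field of the struct column.
  // Assumes the DataFrame has an "id" column, as in the question.
  def explodeStruct(df: DataFrame, structCol: String, outCol: String): DataFrame = {
    // Read the struct's field names from the schema.
    val fieldNames = df.schema(structCol).dataType.asInstanceOf[StructType].fieldNames
    // Pack the field values into an array column, then explode it into rows.
    val fieldCols = fieldNames.map(name => col(s"$structCol.$name"))
    df.select(col("id"), explode(array(fieldCols: _*)).as(outCol))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explode-struct")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data shaped like the question's df.
    val df = Seq((0, "a0", "b0", "c0", "d0"), (1, "a1", "b1", "c1", "d1"))
      .toDF("id", "a", "b", "c", "d")
      .select(col("id"), struct("a", "b", "c", "d").as("items"))

    explodeStruct(df, "items", "col2").show()
    spark.stop()
  }
}
```

This puts the field values in col2. If you actually want the field names (a, b, c, d) there instead, you could build the array from lit(name) rather than col(...); lit lives in the same functions object.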
Upvotes: 1