Edamame

Reputation: 25366

Take only the first (or nth) element after explode

Is it possible to take only the first element after calling the explode() function?

+-----------------------------------------+
|dog                                      |
+-----------------------------------------+
|[[Max,Black],3]                          |
|[[Spot,White],2]                         |
|[[Michael,Yellow],1]                     |
+-----------------------------------------+

For example, in the above case we only want to keep [Max,Black], [Spot,White] and [Michael,Yellow]. The second element in each cell (3, 2, and 1) can be discarded.

Thanks!

Upvotes: 0

Views: 1393

Answers (1)

David Griffin

Reputation: 13927

Assuming your schema looks something like this:

root
 |-- dog: struct (nullable = false)
 |    |-- col1: struct (nullable = false)
 |    |    |-- col1: string (nullable = false)
 |    |    |-- col2: string (nullable = false)
 |    |-- col2: integer (nullable = false)

Then you could do the following:

test.withColumn("dog", $"dog".getField("col1"))

Or, if you want to keep both the original column and the extracted one:

test.select($"dog", $"dog".getField("col1") as "dog2")
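
For anyone who wants to try this end to end, here is a minimal sketch that builds a DataFrame with the same field names and types as the schema above and then applies the withColumn version. The case classes Inner, Dog and Record, the object name, and the local SparkSession settings are my own assumptions for illustration; they are not from the question, and nullability will differ from the printed schema.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical case classes, chosen only so the field names and types
// line up with the assumed schema (dog.col1.col1, dog.col1.col2, dog.col2).
case class Inner(col1: String, col2: String)
case class Dog(col1: Inner, col2: Int)
case class Record(dog: Dog)

object FirstFieldSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation; adjust master/appName as needed.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("first-field-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample rows taken from the table in the question.
    val test = Seq(
      Record(Dog(Inner("Max", "Black"), 3)),
      Record(Dog(Inner("Spot", "White"), 2)),
      Record(Dog(Inner("Michael", "Yellow"), 1))
    ).toDF()

    // Replace the "dog" struct with its first field, dropping the integer.
    val firstOnly = test.withColumn("dog", $"dog".getField("col1"))
    firstOnly.printSchema()
    firstOnly.show(false)

    spark.stop()
  }
}
```

Note that getField works here because the column is assumed to be a struct. If the exploded column turns out to be an array instead, Column.getItem with a zero-based index is the call to look at for picking the nth element.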

Upvotes: 1
