Reputation: 800
I have the following data frame
+--------------------+-------------------+-------------+
| uid2| uid1| timestamp|
+--------------------+-------------------+-------------+
|a |b |1589505008851|
|c |d |1589505012502|
|e |f |1589505016153|
+--------------------+-------------------+-------------+
and I want to create something like this:
+--------------------+-------------------+
| uids| timestamp|
+--------------------+-------------------+
|a |1589505008851|
|c |1589505012502|
|e |1589505016153|
|b |1589505008851|
|d |1589505012502|
|f |1589505016153|
+--------------------+-------------------+
In other words, I would like to merge the uid1 and uid2 columns into a single column. Both columns have exactly the same length and the same data type. Can I do this without creating an additional DataFrame and unioning the two, i.e. just by referencing the columns?
Upvotes: 0
Views: 50
Reputation: 532
You can do that as follows. First, generate some sample data:
import spark.implicits._

val data = Seq(("a", "b", "1589505008851"), ("c", "d", "1589505012502"), ("e", "f", "1589505016153"))
val rdd = spark.sparkContext.parallelize(data)
var df = rdd.toDF("uid2", "uid1", "timestamp")
Then select each uid column (renamed to uids) together with the timestamp, and union the two projections:
df = df.select($"uid2".as("uids"), $"timestamp")
  .union(df.select($"uid1".as("uids"), $"timestamp"))
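Conceptually, the union approach stacks two projections of the same rows. A minimal plain-Scala sketch of that logic (using illustrative tuples in place of a real DataFrame, so it runs without Spark):

```scala
// Sketch of select-then-union on plain Scala collections.
// Each tuple stands for a row (uid2, uid1, timestamp); sample values only.
val rows = Seq(
  ("a", "b", 1589505008851L),
  ("c", "d", 1589505012502L),
  ("e", "f", 1589505016153L)
)

val uid2Part = rows.map { case (uid2, _, ts) => (uid2, ts) } // select(uid2, timestamp)
val uid1Part = rows.map { case (_, uid1, ts) => (uid1, ts) } // select(uid1, timestamp)
val uids = uid2Part ++ uid1Part                              // union of the two projections
```

As in Spark's union, the result simply concatenates the two projections, so all uid2 values come before all uid1 values, matching the desired output order.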
Upvotes: 0
Reputation: 27373
Use the explode / array approach:
df
.select(explode(array($"uid1",$"uid2")).as("uids"),$"timestamp")
.show()
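The same fan-out can be sketched in plain Scala with flatMap (illustrative tuples instead of a DataFrame, so no Spark needed):

```scala
// Sketch of explode(array(uid1, uid2)): each input row produces one output
// row per uid, each keeping the row's timestamp. Sample values only.
val rows = Seq(
  ("a", "b", 1589505008851L),
  ("c", "d", 1589505012502L),
  ("e", "f", 1589505016153L)
)

val exploded = rows.flatMap { case (uid2, uid1, ts) =>
  Seq((uid1, ts), (uid2, ts)) // array($"uid1", $"uid2") then explode
}
```

Note the row order differs from the union approach: explode interleaves the two uids per input row, whereas union emits all uid2 values first. If order matters, add an explicit sort.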
Upvotes: 1