Reputation: 183
Here's a link to an example of what I want to achieve: https://community.powerbi.com/t5/Desktop/Append-Rows-using-Another-columns/m-p/401836. Basically, I need to merge all the rows of a pair of columns into another pair of columns. How can I do this in Spark Scala?
Input
Output
Upvotes: 2
Views: 42
Reputation: 7928
Correct me if I'm wrong, but I understand that you have a dataframe with 4 columns and you want two of them to be in the previous couple of columns right?
For instance with this input (only two rows for simplicity)
df.show
+----+----------+-----------+----------+---------+
|name| date1| cost1| date2| cost2|
+----+----------+-----------+----------+---------+
| A|2013-03-25|19923245.06| | |
| B|2015-06-04| 4104660.00|2017-10-16|392073.48|
+----+----------+-----------+----------+---------+
With just a couple of selects and a unionn you can achieve what you want
df.select("name", "date1", "cost1")
.union(df.select("name", "date2", "cost2"))
.withColumnRenamed("date1", "date")
.withColumnRenamed("cost1", "cost")
+----+----------+-----------+
|name| date| cost|
+----+----------+-----------+
| A|2013-03-25|19923245.06|
| B|2015-06-04| 4104660.00|
| A| | |
| B|2017-10-16| 392073.48|
+----+----------+-----------+
Upvotes: 1