dinushawiki

Reputation: 27

How to join two data frames and add fields in spark

I have two dataframes, for example:

DF1
+----+-------+------+------+
|id  | value1|value2|value3|
+----+-------+------+------+
|  1 |10     |20    |30    |
|  2 |20     |30    |40    |
|  3 |30     |40    |50    |
+----+-------+------+------+

DF2
+----+-------+------+------+
|id  | value1|value2|value3|
+----+-------+------+------+
|  1 |100    |200   |300   |
|  2 |200    |300   |400   |
|  3 |300    |400   |500   |
+----+-------+------+------+

I want to merge these two dataframes by id and, at the same time, add together the columns that share the same name in the two dataframes. So what I want is something like this:

+----+-------+------+------+
|id  | value1|value2|value3|
+----+-------+------+------+
|  1 |100+10 |200+20|300+30|
|  2 |200+20 |300+30|400+40|
|  3 |300+30 |400+40|500+50|
+----+-------+------+------+

The merging part is fairly easy with Spark:

df1.join(df2,Seq("id"))

Is there an efficient way to add together the columns with the same name in the two dataframes, as shown above, using Scala and Spark?
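
For reference, the two example dataframes above can be recreated with something like the following sketch (it assumes a SparkSession named spark is already available and its implicits are imported):

// assumes `spark` is an existing SparkSession
import spark.implicits._

val df1 = Seq(
  (1, 10, 20, 30),
  (2, 20, 30, 40),
  (3, 30, 40, 50)
).toDF("id", "value1", "value2", "value3")

val df2 = Seq(
  (1, 100, 200, 300),
  (2, 200, 300, 400),
  (3, 300, 400, 500)
).toDF("id", "value1", "value2", "value3")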

Upvotes: 1

Views: 272

Answers (1)

Oli

Reputation: 10406

In Scala it is quite easy to manipulate columns in a generic way. You can do it like this:

import org.apache.spark.sql.functions.col

// extract the names of the columns to sum (every column except the join key)
val cols = df1.columns.filter(_ != "id")

// join on id, then sum each pair of same-named columns
val result = df1
    .join(df2, Seq("id"))
    .select(col("id") +: cols.map(c => df1(c) + df2(c) as c): _*)

Upvotes: 1
