jiaying chen

Reputation: 21

Join two dataframes in pyspark

I have two data frames:

df1

+----+----+
|key1|val1|
+----+----+
|a1  |   1|
|b1  |   2|
+----+----+

df2

+----+----+
|key2|val2|
+----+----+
|a2  |   3|
|b2  |   4|
+----+----+

I want to combine them so that every row of df1 is paired with every row of df2, giving the following data frame:

df3

+----+----+----+----+
|key1|val1|key2|val2|
+----+----+----+----+
|a1  |   1|a2  |   3|
|a1  |   1|b2  |   4|
|b1  |   2|a2  |   3|
|b1  |   2|b2  |   4|
+----+----+----+----+

How can I do this in PySpark?

Upvotes: 0

Views: 60

Answers (1)

Saurabh

Reputation: 943

Use a cross join, which pairs every row of df1 with every row of df2:

df3 = df1.crossJoin(df2)
df3.show()

This should give the output you want.

Upvotes: 2
