DBA108642

Reputation: 2112

Merge 4 dataframes into one

I have 4 dataframes, each with only one row and one column, and I would like to combine them into one dataframe. In Python I would do this using the zip function, but I need a way to do it in PySpark. Any suggestions?

Dataframes look like this:

+--------------------------+
|sum(sum(parcelUBLD_SQ_FT))|
+--------------------------+
|              1.13014806E8|
+--------------------------+

+---------------------+
|sum(parcelUBLD_SQ_FT)|
+---------------------+
|         1.13014806E8|
+---------------------+

+---------------+
|count(parcelID)|
+---------------+
|          45932|
+---------------+

+----------------+
|sum(parcelCount)|
+----------------+
|           45932|
+----------------+

and I would like it to look like this:

+--------------------------+---------------------+---------------+----------------+
|sum(sum(parcelUBLD_SQ_FT))|sum(parcelUBLD_SQ_FT)|count(parcelID)|sum(parcelCount)|
+--------------------------+---------------------+---------------+----------------+
|              1.13014806E8|         1.13014806E8|          45932|           45932|
+--------------------------+---------------------+---------------+----------------+

Upvotes: 0

Views: 110

Answers (1)

Ranga Vure

Reputation: 1932

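Since you specified that every dataframe has exactly one row, you can use a cross join to get the desired output. A cross join returns the Cartesian product, so joining single-row dataframes yields a single row containing all of their columns: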

df1.crossJoin(df2).crossJoin(df3).crossJoin(df4)

Upvotes: 1
