Reputation: 23
I have the following DataFrames:
val count: DataFrame = spark.sql(s"select 1, '$database_name', '$table_name', count(*) from $table_name")
Output :
1,stock,T076p,4332
val dist_count: DataFrame = spark.sql(s"select 1, count(*) from (select distinct * from $table_name) t")
Output :
4112 or 4332 (it can be the same as the total count)
val truecount: DataFrame = spark.sql(s"select 1, count(*) from $table_name where flag = true")
Output :
4330
val Falsecount: DataFrame = spark.sql(s"select 1, count(*) from $table_name where flag = false")
Output :
4332
Question: How do I join the above DataFrames to get a resulting DataFrame that gives me the output below?
stock ,T076p, 4332,4332,4330
Here the comma is the column separator.
P.S. I have added a constant 1 to every DataFrame so that I can join them on it (so the 1 is not mandatory here).
Upvotes: 1
Views: 247
Reputation: 29195
Question: How do I join the above DataFrames to get a resulting DataFrame that gives me output like the one below? stock ,T076p, 4332,4332,4330 (here the comma is the column separator)
Just check this example. I have mimicked your requirement with dummy DataFrames, as below.
package com.examples

import org.apache.log4j.{Level, Logger}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object MultiDFJoin {
  def main(args: Array[String]): Unit = {
    Logger.getLogger("org").setLevel(Level.OFF)
    val spark = SparkSession.builder
      .master("local")
      .appName(this.getClass.getName)
      .getOrCreate()
    import spark.implicits._

    val columns = Array("column1", "column2", "column3", "column4")
    // Dummy single-row DataFrames; column1 holds the constant join key (1).
    val df1 = Seq((1, "stock", "T076p", 4332)).toDF(columns: _*).as("first")
    df1.show()
    val df2 = Seq((1, 4332)).toDF(columns.slice(0, 2): _*).as("second")
    df2.show()
    val df3 = Seq((1, 4330)).toDF(columns.slice(0, 2): _*).as("third")
    df3.show()
    val df4 = Seq((1, 4332)).toDF(columns.slice(0, 2): _*).as("four")
    df4.show()

    val finalcsv = df1
      .join(df2, col("first.column1") === col("second.column1"))
      .selectExpr("first.*", "second.column2")
      .join(df3, Seq("column1"))
      // second.column2 is not selected again here, so it is absent from the final row
      .selectExpr("first.*", "third.column2")
      .join(df4, Seq("column1"))
      .selectExpr("first.*", "third.column2", "four.column2")
      .drop("column1") // column1 was used only for joining, hence dropped
      .collect
      .mkString(",")
    print(finalcsv)
  }
}
Result:
+-------+-------+-------+-------+
|column1|column2|column3|column4|
+-------+-------+-------+-------+
|      1|  stock|  T076p|   4332|
+-------+-------+-------+-------+

+-------+-------+
|column1|column2|
+-------+-------+
|      1|   4332|
+-------+-------+

+-------+-------+
|column1|column2|
+-------+-------+
|      1|   4330|
+-------+-------+

+-------+-------+
|column1|column2|
+-------+-------+
|      1|   4332|
+-------+-------+

[stock,T076p,4332,4330,4332]
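Since every DataFrame here holds exactly one row, the constant key may not be needed at all. Below is a minimal sketch of that idea, assuming Spark 2.1 or later (where Dataset.crossJoin is available). The object name CrossJoinSketch, the method combine, and the column names are made up for illustration, and the dummy values are the ones given in the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical sketch: combine single-row DataFrames without a dummy join key.
object CrossJoinSketch {

  // Builds four one-row DataFrames (stand-ins for the four count queries)
  // and glues them side by side with crossJoin.
  def combine(spark: SparkSession): DataFrame = {
    import spark.implicits._

    val total = Seq(("stock", "T076p", 4332L))
      .toDF("database_name", "table_name", "total_count")
    val distinctCnt = Seq(4332L).toDF("distinct_count")
    val trueCnt     = Seq(4330L).toDF("true_count")
    val falseCnt    = Seq(4332L).toDF("false_count")

    // Each input has exactly one row, so the cross product also has one row.
    total.crossJoin(distinctCnt).crossJoin(trueCnt).crossJoin(falseCnt)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("CrossJoinSketch")
      .getOrCreate()

    val combined = combine(spark)
    combined.show()
    // One comma-separated line per row.
    println(combined.collect().map(_.mkString(",")).mkString("\n"))

    spark.stop()
  }
}
```

This only makes sense while each input is guaranteed to have a single row. With more rows, crossJoin produces every combination, and a keyed join like the one above is the safer choice.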
Upvotes: 0