Reputation: 303
I have two JSON sources. The first one always has more columns than the second and is always a superset of it.
val df1 = spark.read.json(sqoopJson)
val df2 = spark.read.json(kafkaJson)
Except operation:
I would like to apply an except operation to df1 and df2, but df1 has 10 columns and df2 has only 8. If I manually drop the 2 extra columns from df1, except works. However, I have 50+ tables/JSON files and need to run EXCEPT for every one of those 50+ pairs, so dropping columns by hand is not practical.
Question:
How can I select from df1 only the columns that exist in df2 (8 columns) and create a new df3? df3 would then hold df1's data restricted to those columns, so its columns match df2's.
Upvotes: 0
Views: 3347
Reputation: 23109
To answer the question "How can I select from df1 only the columns that exist in df2 (8 columns) and create a new df3?":
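import org.apache.spark.sql.functions.col
// Get the 8 column names from df2 and turn each into a Column
val columns = df2.schema.fieldNames.map(col(_))
// Select only those columns from df1, in the same order as df2
val df3 = df1.select(columns: _*)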
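Since the question mentions 50+ table pairs, the same idea could be wrapped in a small helper and reused for each pair. The following is only a sketch: the object name `ExceptOnSharedColumns`, the helper `exceptOnSharedColumns`, and the tiny in-memory DataFrames are hypothetical stand-ins for the real JSON inputs, and it assumes top-level column names contain no dots (otherwise `col` would read them as nested fields).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ExceptOnSharedColumns {
  // Hypothetical helper: restrict `wide` to the columns of `narrow`
  // (in narrow's column order), then return the rows not present in `narrow`.
  def exceptOnSharedColumns(wide: DataFrame, narrow: DataFrame): DataFrame = {
    val narrowCols = narrow.columns.map(col)
    wide.select(narrowCols: _*).except(narrow)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("except-sketch")
      .getOrCreate()
    import spark.implicits._

    // Stand-ins for spark.read.json(sqoopJson) and spark.read.json(kafkaJson)
    val df1 = Seq((1, "a", "x"), (2, "b", "y")).toDF("id", "name", "extra")
    val df2 = Seq((1, "a")).toDF("id", "name")

    // Only the (2, "b") row is left, with the columns id and name
    val diff = exceptOnSharedColumns(df1, df2)
    diff.show()

    spark.stop()
  }
}
```

Selecting in df2's column order matters because `except` compares columns by position, not by name. For many tables, the helper could be called in a loop over pairs of input paths.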
Hope this helps!
Upvotes: 6