voidpro

Reputation: 1672

How to display a mismatch report with a label in Spark 1.6 using the Scala except function?

Consider two DataFrames, df1 and df2.

df1 has the following data:

 A | B
-------
 1 | m
 2 | n
 3 | o

df2 has the following data:

 A | B
-------
 1 | m
 2 | n
 3 | p

df1.except(df2) returns

 A | B
-------
 3 | o
 3 | p

How can I display the result as below?

df1:  3 | o
df2:  3 | p

Upvotes: 1

Views: 56

Answers (1)

hagarwal

Reputation: 1163

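According to the API docs, df1.except(df2) returns a new DataFrame containing the rows in this frame but not in the other frame, i.e. the rows that are in df1 and not in df2. It is one-directional, so for the data above it returns only 3 | o; the rows that exist only in df2 need a second call, df2.except(df1). A custom except function that labels both sides could be written as: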

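import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit

// Rows found in only one frame, tagged with the frame they came from.
def except(df1: DataFrame, df2: DataFrame): DataFrame = {
  val edf1 = df1.except(df2).withColumn("df", lit("df1"))
  val edf2 = df2.except(df1).withColumn("df", lit("df2"))
  edf1.unionAll(edf2) // Spark 1.6 has unionAll; DataFrame.union only exists from 2.0
}
// Output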
+---+---+---+
|  A|  B| df|
+---+---+---+
|  3|  o|df1|
|  3|  p|df2|
+---+---+---+
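
To print the labelled rows in the "df1:  3 | o" shape the question asks for, something like the following might work. It is only a sketch: LabelledReport, formatLine and report are made-up names, the labelled frame is built by hand to stand in for the output shown above, and it assumes the label column is called "df" and that the result is small enough to collect() to the driver.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

object LabelledReport {
  // Pure formatting step: "<label>:  v1 | v2 | ..."
  def formatLine(label: String, values: Seq[Any]): String =
    label + ":  " + values.mkString(" | ")

  // Turns each row of a labelled frame into one report line.
  // Assumption: labelCol holds the source label and the frame is small,
  // because collect() brings every row back to the driver.
  def report(labelled: DataFrame, labelCol: String = "df"): Seq[String] = {
    val dataCols = labelled.columns.filter(_ != labelCol).toSeq
    labelled.collect().toSeq.map { row =>
      formatLine(row.getAs[String](labelCol), dataCols.map(c => row.getAs[Any](c)))
    }
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("labelled-report").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Hand-built stand-in for the labelled mismatch frame.
    val labelled = Seq((3, "o", "df1"), (3, "p", "df2")).toDF("A", "B", "df")

    report(labelled).foreach(println)
    sc.stop()
  }
}
```

Since collect() pulls every mismatched row onto the driver, this only suits small reports; for anything larger, calling show() on the labelled frame is probably the safer choice.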

Upvotes: 2
