Reputation: 73
I am trying to perform IF
([spark's coalesce]) on top of a left-outer-joined output, but it seems that NULL
is not being handled as expected. Here are my base tables, sample query, output, and expected output:
Base tables:
t1:
a,100
b,101
c,102
t2:
101
Query:
select a.x, a.x1, IF(b.x1 is NULL,a.x1,b.x1) from t1 a LEFT OUTER JOIN t2 b on a.x1=b.x1;
Output:
a,100,null
b,101,101
c,102,null
Expected:
a,100,100
b,101,101
c,102,102
I have also tried wrapping the above query in an outer query and performing the IF on top of that, but with no success. Please suggest if I am missing something.
Upvotes: 3
Views: 17023
Reputation: 1653
This seems to work for me:
File: tbl1
1 a
2 b
3 c
File: tbl2
1 c
3 d
case class c_tbl1(c1: String, c2: String)

// Load tbl1 as an RDD of case-class rows and register it as a temp table
sc.textFile("tbl1").map { row =>
  val parts = row.split("\t")
  c_tbl1(parts(0), parts(1))
}.registerTempTable("t_tbl1")

case class c_tbl2(c1: String, c2: String)

sc.textFile("tbl2").map { row =>
  val parts = row.split("\t")
  c_tbl2(parts(0), parts(1))
}.registerTempTable("t_tbl2")

sqlContext.sql("""select t.c1, t.c2, IF(t2.c1 is null, 1, 2), t2.c2
                  from t_tbl1 t left outer join t_tbl2 t2 on t.c1 = t2.c1""").collect.foreach(println)
[1,a,2,c]
[2,b,1,null]
[3,c,2,d]
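Applied to the tables in the question, the same IF-over-left-outer-join logic produces the output you expected. Here is a plain-Python sketch of those semantics (using the t1/t2 data from the question; this is an illustration, not Spark code):

```python
# Simulate: select a.x, a.x1, IF(b.x1 IS NULL, a.x1, b.x1)
#           from t1 a LEFT OUTER JOIN t2 b on a.x1 = b.x1
t1 = [("a", 100), ("b", 101), ("c", 102)]  # rows (x, x1)
t2 = {101}                                 # values available as b.x1

rows = []
for x, x1 in t1:
    b_x1 = x1 if x1 in t2 else None        # left outer join: NULL when no match
    result = x1 if b_x1 is None else b_x1  # IF(b.x1 IS NULL, a.x1, b.x1)
    rows.append((x, x1, result))

print(rows)  # [('a', 100, 100), ('b', 101, 101), ('c', 102, 102)]
```

If the real query still returns NULL in the third column, it suggests the IF expression is not actually seeing a SQL NULL (e.g. the unmatched side holds an empty string or the literal text "null" rather than NULL).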
Upvotes: 1
Reputation: 8497
Try a CASE statement (though I am not sure whether CASE is supported by your version of Spark SQL):
select a.x, a.x1,
CASE WHEN b.x1 IS NULL THEN a.x1
ELSE b.x1
END as bx1
from t1 a LEFT OUTER JOIN t2 b on a.x1=b.x1;
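Since the question links to Spark's coalesce, note that `coalesce(b.x1, a.x1)` (return the first non-NULL argument) expresses the same thing as the CASE above, i.e. `select a.x, a.x1, coalesce(b.x1, a.x1) from t1 a left outer join t2 b on a.x1 = b.x1`. The equivalence can be sketched in plain Python (an illustration of the semantics, not Spark code):

```python
def coalesce(*args):
    """Return the first non-None argument, like SQL COALESCE."""
    return next((a for a in args if a is not None), None)

def case_when(b_x1, a_x1):
    """CASE WHEN b.x1 IS NULL THEN a.x1 ELSE b.x1 END"""
    return a_x1 if b_x1 is None else b_x1

# The three (b.x1, a.x1) pairs the question's join produces
pairs = [(None, 100), (101, 101), (None, 102)]
results = [coalesce(b, a) for b, a in pairs]
assert results == [case_when(b, a) for b, a in pairs]
print(results)  # [100, 101, 102]
```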
Upvotes: 1