TomG
TomG

Reputation: 301

Left outer join query with Null check in hive is not working

I have written a query to find records which exists in one table but not in the other. Here is the query

CREATE TABLE kai_prod.Held1Miss AS 
SELECT distinct jndheld.icerecordid from 
(
SELECT held1.icerecordid,held2.icerecordid icerecordid2 
FROM kai_prod.tblheld116 held1 LEFT OUTER JOIN
kai_prod.tblheld216 held2 ON
held1.icerecordid = held2.icerecordid
) jndheld
WHERE jndheld.icerecordid2 IS NULL;

I am getting a count of 266585 in kai_prod.Held1Miss

however when i double check using the below query , I found that the above count is wrong . The above count should have been zero

select count(*) from kai_prod.tblheld116 INNER JOIN
kai_prod.Held1Miss ON
tblheld116.icerecordid = Held1Miss.icerecordid;

266585

is this a bug in hive or issue with NUll?

please find the version details.

We are running hadoop with spark on Microsoft HDInsight distribution in Azure

hive --version Hive 1.2.1000.2.6.2.3-1 Subversion git://ctr-e134-1499953498516-137771-01-000004.hwx.site/grid/0/jenkins/workspace/HDP-parallel-ubuntu16/SOURCES/hive -r 4e24791861b0ce72f1185e572ffe1fd7f737dfdd Compiled by jenkins on Thu Sep 14 08:04:15 UTC 2017 From source with checksum 77da03c67efbe84d5e5b69bd4d91ad54

hadoop version Hadoop 2.7.3.2.6.2.3-1 Subversion [email protected]:hortonworks/hadoop.git -r 1ceeb58bb3bb5904df0cbb7983389bcaf2ffd0b6 Compiled by jenkins on 2017-09-14T07:25Z Compiled with protoc 2.5.0 From source with checksum 90b73c4c185645c1f47b61f942230 This command was run using /usr/hdp/2.6.2.3-1/hadoop/hadoop-common-2.7.3.2.6.2.3-1.jar

Upvotes: 0

Views: 1120

Answers (2)

顾泽鹏
顾泽鹏

Reputation: 11

i think your second sql maybe wrong:

select count(*) from kai_prod.tblheld116 INNER JOIN
kai_prod.Held1Miss ON
tblheld116.icerecordid = Held1Miss.icerecordid;

you should join the table kai_prod.tblheld216 rather than kai_prod.tblheld116 to double check the null key

Upvotes: 1

Rahul
Rahul

Reputation: 409

Left Outer Join will always produce rows from left table.

Please use inner join for getting required results.

Thank you

Upvotes: 1

Related Questions