Don P

Reputation: 63708

What is the HQL equivalent of NOT IN?

I am trying to use the NOT IN operator in Hive Query Language (HQL), but the following query gives me an error:

SELECT Name
FROM names_in_countries
WHERE Country = 'Mexico'
AND Name NOT IN (
    SELECT Name
    FROM names_in_countries
    WHERE Country <> 'Mexico')

Here is the original question, with an answer in SQL.

Upvotes: 0

Views: 1747

Answers (2)

Mark Grover

Reputation: 4080

Something like this might work.

SELECT t1.Name 
FROM   names_in_countries t1 
       LEFT OUTER JOIN (SELECT Name, 
                               Country 
                        FROM   names_in_countries 
                        WHERE  Country <> 'Mexico') t2 
         ON ( t1.Name = t2.Name ) 
WHERE  t1.Country = 'Mexico' 
       AND t2.Country IS NULL 

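It's crucial to apply the Country <> 'Mexico' filter to t2 inside a sub-select, because the columns retrieved from t2 change once the LEFT OUTER JOIN is performed: they come back as NULL whenever a record from t1 has no corresponding entry in t2. A filter on t2.Country placed in the outer WHERE clause would discard exactly those NULL rows, since NULL <> 'Mexico' never evaluates to true.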
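To check that the join rewrite really behaves like NOT IN, here is a minimal sketch that runs both queries on a few made-up rows. It uses SQLite through Python's sqlite3 module as a stand-in, not Hive itself, so it only illustrates the SQL semantics. The sample rows and the helper name run are assumptions for illustration, and the data contains no NULL names (a case where NOT IN and the join version can differ). The third query shows the pitfall described above: the Country <> 'Mexico' filter moved out of the sub-select into the outer WHERE clause.

```python
import sqlite3

# Hypothetical sample rows; SQLite stands in for Hive in this sketch.
ROWS = [
    ("Juan", "Mexico"),
    ("Maria", "Mexico"),
    ("Maria", "Spain"),
    ("John", "USA"),
    ("Pedro", "Mexico"),
]

# The query from the question (SQLite accepts NOT IN with a subquery).
NOT_IN_SQL = """
SELECT Name
FROM names_in_countries
WHERE Country = 'Mexico'
AND Name NOT IN (
    SELECT Name
    FROM names_in_countries
    WHERE Country <> 'Mexico')
"""

# The LEFT OUTER JOIN rewrite: keep t1 rows that found no match in t2.
ANTI_JOIN_SQL = """
SELECT t1.Name
FROM   names_in_countries t1
       LEFT OUTER JOIN (SELECT Name, Country
                        FROM   names_in_countries
                        WHERE  Country <> 'Mexico') t2
         ON ( t1.Name = t2.Name )
WHERE  t1.Country = 'Mexico'
       AND t2.Country IS NULL
"""

# Pitfall: the t2 filter sits in the outer WHERE, after the join.
# t2.Country cannot be both NULL and different from 'Mexico'.
MISPLACED_FILTER_SQL = """
SELECT t1.Name
FROM   names_in_countries t1
       LEFT OUTER JOIN names_in_countries t2
         ON ( t1.Name = t2.Name )
WHERE  t1.Country = 'Mexico'
       AND t2.Country <> 'Mexico'
       AND t2.Country IS NULL
"""


def run(sql):
    """Load the sample rows into a fresh in-memory table and return sorted names."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE names_in_countries (Name TEXT, Country TEXT)")
    conn.executemany("INSERT INTO names_in_countries VALUES (?, ?)", ROWS)
    names = sorted(row[0] for row in conn.execute(sql))
    conn.close()
    return names


not_in_names = run(NOT_IN_SQL)
anti_join_names = run(ANTI_JOIN_SQL)
misplaced_names = run(MISPLACED_FILTER_SQL)

print("NOT IN:          ", not_in_names)
print("LEFT OUTER JOIN: ", anti_join_names)
print("misplaced filter:", misplaced_names)
```

On this sample data the first two queries both return Juan and Pedro (Maria is excluded because she also appears under Spain), while the misplaced-filter variant returns no rows at all.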

Upvotes: 1

user756519


I don't know Hive Query Language. Based on what I have read here: Hive Queries on Tables, the following script might work. Give it a try.

Script:

SELECT      name
FROM        mytable
GROUP BY    name
HAVING      AVG((CASE WHEN country = 'Mexico' THEN 1 ELSE 0 END) * 1.) >= 1
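The idea behind the script is that the CASE expression flags each row with 1 for Mexico and 0 otherwise, so the average per name reaches 1 only when every row for that name is Mexico. A minimal sketch to try this out follows, again using SQLite via Python's sqlite3 as a stand-in rather than Hive. The sample rows are made up, the table is called mytable to match the script, and the literal is written as 1.0 instead of 1. purely as a precaution.

```python
import sqlite3

# Hypothetical sample rows; SQLite stands in for Hive in this sketch.
SAMPLE_ROWS = [
    ("Juan", "Mexico"),
    ("Maria", "Mexico"),
    ("Maria", "Spain"),
    ("John", "USA"),
    ("Pedro", "Mexico"),
]

# Same logic as the script above: average a 1/0 "is Mexico" flag per name.
GROUP_BY_SQL = """
SELECT      name
FROM        mytable
GROUP BY    name
HAVING      AVG((CASE WHEN country = 'Mexico' THEN 1 ELSE 0 END) * 1.0) >= 1
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT, country TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", SAMPLE_ROWS)

# Juan and Pedro average 1.0; Maria averages 0.5; John averages 0.0.
mexico_only_names = sorted(row[0] for row in conn.execute(GROUP_BY_SQL))
conn.close()

print(mexico_only_names)
```

One difference from the NOT IN form is worth keeping in mind: GROUP BY returns each name once, whereas the NOT IN query returns one row per matching Mexico record, so the two can differ when a name is listed under Mexico more than once.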

Upvotes: 2
