Raunak Jhawar
Raunak Jhawar

Reputation: 1651

Subquery in Hive Where Clause

I need help with this piece:

How can I write the following in HIVE...

SELECT * 
FROM tableA 
WHERE colA = (SELECT MAX(date_column) FROM tableA)

I need to query only the latest current records from the table. I am storing dates as strings in hive as "yyyy-mm-dd".

Upvotes: 1

Views: 3010

Answers (3)

Ary Jazz
Ary Jazz

Reputation: 1676

Something like this might work:

SELECT a.* 
FROM tableA a
JOIN (SELECT MAX(date_column) AS max_date_column
      FROM tableA) b
ON a.colA = b.max_date_column

Hope it helps

EDIT: I have no idea how I got to this old question, you probably solved it long ago :)

Upvotes: 1

Kim Moritz
Kim Moritz

Reputation: 163

Note that in Hive 0.13+, you can use subqueries in WHERE statements.

Upvotes: 0

libjack
libjack

Reputation: 6443

Avoid a JOIN, use the analytics and windowing features:

select * from (select *, rank() over (order by date_col desc) as rank from tableA) S where S.rank = 1;

Upvotes: 1

Related Questions