Mehmet Ates
Mehmet Ates

Reputation: 133

Hive: Select rows with max value from a column

I want to select all rows for which the timestamp column has the maximum value. The data looks like this:

A      B      timestamp
john   smith   2018
bob    dylan   2018
adam   levine  2017
bob    dylan   2017

The result should be:

A      B      timestamp
john   smith   2018
bob    dylan   2018

With Impala, the following SQL Query works: SELECT * FROM table WHERE timestamp=(SELECT Max(timestamp) from table)

But with Hive, the SQL Query doesn't.

Upvotes: 3

Views: 14759

Answers (2)

ahmed2512
ahmed2512

Reputation: 74

You can use rank :

SELECT
    D.A
    ,D.B
    ,D.timestamp
FROM 
    (
        SELECT
            A
            ,B
            ,timestamp
            ,RANK(timestamp) over ( order by timestamp DESC ) as rank_
        FROM 
            TABLE
    ) D 
WHERE
    D.rank_=1

Upvotes: 0

Harold
Harold

Reputation: 463

Please always include the error message.

Try with

SELECT * FROM table WHERE timestamp IN (SELECT Max(timestamp) from table)

Upvotes: 4

Related Questions