harshit_sharan
harshit_sharan

Reputation: 141

SQL / Hive Select first rows with certain column value

Consider following 3 column table structure:

id, b_time, b_type

id is a string, there will be multiple rows with the same id in the table. b_time is timestamp and b_type can have any one of 2 possible values - 'A' or 'B'.

I want to select all the rows that fulfill one of the 2 conditions, priority wise:

  1. For all ids, select the row with highest timestamp, where b_type='A'.

  2. If for an id, there are no rows where b_type='A', select the row with highest timestamp, irrespective of the b_type value.

Please suggest the sql query which should tackle this problem(even if it requires creation of temporary intermediate tables).

Upvotes: 0

Views: 5455

Answers (2)

harshit_sharan
harshit_sharan

Reputation: 141

Figured out a simple and intuitive way to do this:

SELECT * FROM     
  (SELECT id
    , b_time
    , b_type
    , ROW_NUMBER() OVER (PARTITION BY id ORDER BY b_type ASC,b_time DESC) AS RN
    FROM your_table
  )    
WHERE RN = 1

Upvotes: 3

Vamsi Prabhala
Vamsi Prabhala

Reputation: 49260

with nottypea as (select id, max(b_time) as mxtime 
                  from tablename 
                  group by id 
                  having sum(case when b_type = 'A' then 1 else 0 end) = 0)
, typea as (select id, max(b_time) as mxtime 
            from tablename 
            group by id 
            having sum(case when b_type = 'A' then 1 else 0 end) >= 1)
select id,mxtime,'B' as typ from nottypea
union all
select id,mxtime,'A' as typ from typea

Upvotes: 0

Related Questions