Reputation: 8395
I need to find a Hive query that returns the 2 top names for each 6 minutes interval since 00:00:00,
Data looks like -
Name Time
A 00:00:00
B 00:03:53
C 00:01:16
A 00:04:34
A 00:07:32
A 00:18:36
C 00:16:12
C 00:05:04
B 00:01:50
B 00:12:05
A 00:11:20
B 00:04:27
B 00:02:47
A 00:00:23
A 00:00:23
B 00:36:21
B 00:02:46
I would like to write the query in Hive which is very new for me but even using mysql query , I can derive the result in Hive.
Upvotes: 0
Views: 89
Reputation: 3897
select
*
from
(
select NAME
, time_interval_6
, rank() over (partition by NAME, time_interval_6 order by ct desc) as ranking
from
(select count(1) as ct
, NAME
, floor((floor(cast(substring(time,1,2) as int)*60 + cast(substring(time,4,2) as int)))/6) as time_interval_6
FROM MY_TABLE
group by NAME, floor((floor(cast(substring(time,1,2) as int)*60 + cast(substring(time,4,2) as int)))/6)
) a
)b
where ranking <= 2
;
Upvotes: 1