Reputation: 888
I have the following data set as below.I want to get the certain column values based on the group by (or some other functions) on specific columns. My data set looks like:
id zip Action content duration OS TIME
================================================
1 11 START DELL LINUX 12
1 11 JUMP HP UNIX 14
1 11 STOP HP 10 LINUX 16
1 11 START WIN LINUX 2
1 11 JUMP HP UNIX 4
1 11 STOP SONY 12 LINUX 15
2 12 START HP UNIX 3
2 12 STOP FOP 2 WINDOWS 10
--------------------------------------------
I want to get the all column values based on same (id,zip) group where Action='STOP' and max time out of filtered records. my expected output would be:
id zip Action content duration OS
========================================
1 11 STOP HP 10 LINUX
2 12 STOP FOP 2 WINDOWS
--------------------------------------------
How can i achieve the same using HIVE? please help.
Upvotes: 0
Views: 2972
Reputation: 44921
row_number
select id,zip,Action,content,duration,OS
from (select *
,row_number() over
(
partition by id,zip
order by time desc
) as rn
from mytable
where action = 'STOP'
) t
where rn = 1
+----+-----+--------+---------+----------+---------+
| id | zip | action | content | duration | os |
+----+-----+--------+---------+----------+---------+
| 1 | 11 | STOP | HP | 10 | LINUX |
| 2 | 12 | STOP | FOP | 2 | WINDOWS |
+----+-----+--------+---------+----------+---------+
Upvotes: 1