BigD
BigD

Reputation: 888

Group by fields in HIVE to get all columns using Hive

I have the following data set as below.I want to get the certain column values based on the group by (or some other functions) on specific columns. My data set looks like:

id  zip  Action  content  duration  OS    TIME
================================================
1  11    START   DELL               LINUX   12
1  11    JUMP    HP                 UNIX    14
1  11    STOP    HP       10        LINUX   16
1  11    START   WIN               LINUX    2
1  11    JUMP    HP                 UNIX    4
1  11    STOP    SONY     12        LINUX   15
2  12    START   HP                 UNIX    3
2  12    STOP    FOP      2         WINDOWS 10
--------------------------------------------

I want to get the all column values based on same (id,zip) group where Action='STOP' and max time out of filtered records. my expected output would be:

id  zip  Action  content  duration  OS
========================================
1  11    STOP    HP       10        LINUX

2  12    STOP    FOP      2         WINDOWS
--------------------------------------------

How can i achieve the same using HIVE? please help.

Upvotes: 0

Views: 2972

Answers (1)

David דודו Markovitz
David דודו Markovitz

Reputation: 44921

row_number

select  id,zip,Action,content,duration,OS

from   (select  *
               ,row_number() over
                (
                    partition by    id,zip
                    order by        time desc
                )   as rn

        from    mytable

        where   action = 'STOP'
        ) t

where   rn = 1

+----+-----+--------+---------+----------+---------+
| id | zip | action | content | duration |   os    |
+----+-----+--------+---------+----------+---------+
|  1 |  11 | STOP   | HP      |       10 | LINUX   |
|  2 |  12 | STOP   | FOP     |        2 | WINDOWS |
+----+-----+--------+---------+----------+---------+

Upvotes: 1

Related Questions