Reputation: 275
I have a Hive table with data in the below format
day class start_time count kpi1 kpi2 kpi3 kpi4 ... kpi160
-----------------------------------------------------------------------
20161010 abc 00 12 1 0 null 0 ...
I want to write a hive query to fetch the data in the format below
using some calculations like max
, min
, and avg
.
day class start_time count kpi_name kpi_max kpi_min kpi_avg
-----------------------------------------------------------------------
20161010 abc 00 12 kpi1 max(kpi1) min(kpi1) avg(kpi1)
20161010 abc 00 12 kpi2 max(kpi2) min(kpi2) avg(kpi2)
Please suggest a solution to fetch the data in this format.
Thanks.
Upvotes: 1
Views: 591
Reputation: 17585
You need to put all the kpi
s in a map, explode the map to create one column, and then aggregate.
Example:
Data:
+---------+------+-----------+-------+-----+-----+------+------+------+------+
|day_ |class |start_time |count_ |kpi0 |kpi1 | kpi2 | kpi3 | kpi4 | kpi5 |
+---------+------+-----------+-------+-----+-----+------+------+------+------+
|20161010 |abc |00 |12 |1 |2 |3 |8 |9 |6 |
+---------+------+-----------+-------+-----+-----+------+------+------+------+
|20161010 |abc |00 |12 |4 |5 |null |6 |10 |null |
+---------+------+-----------+-------+-----+-----+------+------+------+------+
Query:
SELECT day_
, class
, start_time
, count_
, kpi_type
, MAX(vals) AS max_vals
, MIN(vals) AS min_vals
, AVG(vals) AS avg_vals
FROM (
SELECT day_, class, start_time, count_, kpi_type, vals
FROM database.table
LATERAL VIEW EXPLODE(MAP('kpi0', kpi0
, 'kpi1', kpi1
, 'kpi2', kpi2
, 'kpi3', kpi3
, 'kpi4', kpi4
, 'kpi5', kpi5)) et AS kpi_type, vals ) x
GROUP BY day_, class_, start_time, count_, kpi_type
Output:
+---------+------+-----------+-------+---------+---------+---------+---------+
|day_ |class |start_time |count_ |kpi_type |max_vals |min_vals |avg_vals |
+---------+------+-----------+-------+---------+---------+---------+---------+
|20161010 |abc |00 |12 |kpi0 |4 |1 |2.5 |
+---------+------+-----------+-------+---------+---------+---------+---------+
|20161010 |abc |00 |12 |kpi1 |5 |2 |3.5 |
+---------+------+-----------+-------+---------+---------+---------+---------+
|20161010 |abc |00 |12 |kpi2 |3 |3 |3.0 |
+---------+------+-----------+-------+---------+---------+---------+---------+
|20161010 |abc |00 |12 |kpi3 |8 |6 |7.0 |
+---------+------+-----------+-------+---------+---------+---------+---------+
|20161010 |abc |00 |12 |kpi4 |10 |9 |9.5 |
+---------+------+-----------+-------+---------+---------+---------+---------+
|20161010 |abc |00 |12 |kpi5 |6 |6 |6.0 |
+---------+------+-----------+-------+---------+---------+---------+---------+
Upvotes: 1
Reputation: 2089
If you want to get the min,max,avg , you have to specify the group By column , let's suppose you want to group by day.
SELECT day,
class,
start_time,
count,
kpi1,
MAX(kpi1) as max_kpi1,
MIN(kpi1) as min_kpi1,
AVG(kpi1) as avg_kpi1
FROM table
GROUP BY day
Upvotes: 0