Reputation: 2456
as hive only support subqueries in from cluase like select *from (subquery) and also support only equjoin so how can We calculate cumulative salary from a table like table employee have records as below.
id name salary e001 manish 10000 e002 amit 40000 e003 santosh 45000 e004 rohit 30000
so output should be like below
id name salary cumsalary e001 manish 10000 10000 e002 amit 40000 50000 e003 santosh 45000 95000 e004 rohit 30000 125000
how can I achive this in hive
Upvotes: 2
Views: 1956
Reputation: 93734
Use Correlated Sub-Query
to Find Cumulative Sum
create table #temp(id varchar(10), name varchar(50), salary int)
insert #temp values
('e001', 'manish', 10000),
('e002', 'amit ', 40000),
('e003', 'santosh ', 45000),
('e004', 'rohit ', 30000)
SELECT *,
(SELECT Sum(salary)
FROM #temp t2
WHERE t2.id <= t.id) As cumsalary
FROM #temp t
Upvotes: 1
Reputation: 32402
Hive supports the sum()
analytic function, so you should be able to do the follwing:
select t1.* ,
sum(salary) over (order by id) cumsalary
from table t1
For more info about Hive's analytic functions see https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics
Upvotes: 3