agarwal_achhnera
agarwal_achhnera

Reputation: 2456

how to calculate cumulative salary in hive

as hive only support subqueries in from cluase like select *from (subquery) and also support only equjoin so how can We calculate cumulative salary from a table like table employee have records as below.

id     name       salary
e001  manish      10000
e002  amit         40000
e003  santosh     45000
e004  rohit       30000

so output should be like below

id       name           salary        cumsalary
e001     manish         10000          10000
e002     amit           40000          50000
e003     santosh        45000          95000
e004     rohit           30000         125000

how can I achive this in hive

Upvotes: 2

Views: 1956

Answers (2)

Pரதீப்
Pரதீப்

Reputation: 93734

Use Correlated Sub-Query to Find Cumulative Sum

 create table #temp(id varchar(10),    name  varchar(50),     salary int)
 insert #temp values 
('e001',  'manish',      10000),
('e002',  'amit   ',      40000),
('e003',  'santosh ',    45000),
('e004',  'rohit  ',     30000)

SELECT *,
       (SELECT Sum(salary)
        FROM   #temp t2
        WHERE  t2.id <= t.id) As cumsalary
FROM   #temp t 

Upvotes: 1

FuzzyTree
FuzzyTree

Reputation: 32402

Hive supports the sum() analytic function, so you should be able to do the follwing:

select t1.* ,
sum(salary) over (order by id) cumsalary
from table t1

For more info about Hive's analytic functions see https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics

Upvotes: 3

Related Questions