Reputation: 108
I'm trying to find the number of employees who are paid less than average wage.
I'm pretty new to hive and struggling a bit, could someone explain whats wrong with my statement and help me out please?
My statement -
SELECT COUNT(*) FROM(SELECT wage, AVG(wage) AS avgWage FROM emp_wages) WHERE wage < avgWage;
The error -
ParseException line 1:82 cannot recognize input near 'where' 'wage' '<' in subquery source
Any help appreciated!
Upvotes: 0
Views: 2235
Reputation: 132
The correct query would be:
select count(*) where wage <(select avg(wage) from emp_wages);
You are getting a parsing error as wage and avgWage is in subquery.
Upvotes: 0
Reputation: 49260
A syntax error. Derived table should be aliased.
SELECT COUNT(*)
FROM (SELECT wage, AVG(wage) AS avgWage FROM emp_wages group by wage) t --alias needed here
WHERE wage < avgWage;
Query wise, it needs a change.
select count(*)
from (SELECT wage, AVG(wage) over() AS avgWage
FROM emp_wages
) t
where wage < avgWage
Upvotes: 2
Reputation: 525
The problem is AVG is an aggregation function. If you want to map one to many relations, you need to use a cross join function:
select
count(*), avg(v1.wage),
sum(case when v.wage < v2.avgwage then 1 else 0 end) below_average
from
emp_wages v cross join (select avg(wage) as avgwage from emp_wages) as v2
Upvotes: 0
Reputation: 630
SELECT COUNT(*)
FROM (SELECT wage, AVG(wage) AS avgWage FROM emp_wages group by wage)avg --group by needed
WHERE wage < avgWage;
Upvotes: 0