Reputation: 379
I was trying to parameterise the value in hive rather than hard coding it in query. Below is the query.
select * from employee where sal >30000
But rather than using 30000 value as hard coded I need that to come from the same query like below. But I am running into issues :
select * from employee where sal > (select max(sal) from employee)
Any help is appreciated.
Thanks
Upvotes: 2
Views: 688
Reputation: 38335
Hive does not support such subqueries and also does not allow to calculate a variable, variables in Hive are simple text substitution without calculation. You can calculate predicate in a shell and pass to your hive script like in this answer: https://stackoverflow.com/a/37821218/2700344
If you want to do it in the same hive query, nothing wrong in calculating subquery and do a cross join with it's result, then filter. subquery will be calculated first, then it's result placed in the distributed cache and applied in the filter in each mapper reading the table:
with sub as(--this is example only and makes no sense
--replace with real query
--of course there is no rows with sal>max sal in the same table
select max(S.sal) AS MaxSal from employee S
)
select *
from employee e
cross join sub s
where e.sal>s.MaxSal
If you write it without CROSS JOIN
, simply from employee e, sub s
, or JOIN without on condition, it is still the same cross join, better write it explicitly using cross join.
Upvotes: 1
Reputation: 4754
You can try using this form of Hive query. This will get the employees having salary equal to the highest salary.
SELECT e1.* FROM employee e1
JOIN
(SELECT MAX(sal) as max_sal FROM employee) e2
ON e1.sal = e2.max_sal;
Example:
Table: employee
id fname sal
1 AAA 15000
2 BBB 35000
3 CCC 12000
4 DDD 35000
5 EEE 9000
Query execution output:
2 BBB 35000
4 DDD 35000
Upvotes: 1