Reputation: 594
I'm trying to do something like below with Hive. How can I have a column in Hive be defined as a subquery? Is this possible in Hive?
hive -e "
select
distinct i.SearchListingID,
(select count(*)
from calls c
where c.ServiceID = i.SearchListingID
) as CallsCount
from Impressions i
where i.yyyymmdd = 20120401
limit 10" > ImpressionCalls.txt
Hive history file=/tmp/jd/hive_job_log_jd_201205222049_550931420.txt
FAILED: Parse Error: line 4:1 cannot recognize input near 'select' 'count' '(' in expression specification
Upvotes: 1
Views: 24318
Reputation: 4080
Correlated subqueries are not supported in Hive. How about something like this instead? (I didn't get a chance to verify this query on Hive myself)
select
i.SearchListingID,
count(*)
from
(
select
distinct i.SearchListingID as SearchListingID
from
Impressions i
where
i.yyyymmdd = 20120401
)i
join
calls c
on(c.ServiceID = i.SearchListingID)
limit 10
Upvotes: 9