Ben C Wang
Ben C Wang

Reputation: 607

How do I return a subquery result as a column name for another query?

I am trying to compute the avg/min/max of the record count for a view in hive

select avg(rec_count), max(rec_count), min(rec_count)

And trying to implement something like:

select avg(rec_count), max(rec_count), min(rec_count)
where rec_count in 
(select count(record_number) as rec_count from archives
group by record_number);

But I'm getting an error message:

FAILED: SemanticException Line 0:-1 Invalid column reference 'rec_count' in definition of SubQuery sq_1 [ rec_count in (select count(record_number) as rec_count from archives group by record_number) ] used as sq_1 at Line 2:18

I would like to know if I can do this without having to create another table/view that contains the grouped/aggregated values of the original table.

Sample data:

Blockquote

Record number 3031 has 4 records, 4050 has 6 records so the expected results that I'm hoping to get from the query would be:

avg = 5 min = 4 max = 6

Upvotes: 0

Views: 97

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1271141

I think you want something like:

select avg(rec_count), max(rec_count), min(rec_count)
from (select record_number, count(*) as rec_count
      from archives
      group by record_number
     ) a

Upvotes: 1

Related Questions