Reputation: 607
I am trying to compute the avg/min/max of the record count for a view in hive
select avg(rec_count), max(rec_count), min(rec_count)
And trying to implement something like:
select avg(rec_count), max(rec_count), min(rec_count)
where rec_count in
(select count(record_number) as rec_count from archives
group by record_number);
But I'm getting an error message:
FAILED: SemanticException Line 0:-1 Invalid column reference 'rec_count' in definition of SubQuery sq_1 [ rec_count in (select count(record_number) as rec_count from archives group by record_number) ] used as sq_1 at Line 2:18
I would like to know if I can do this without having to create another table/view that contains the grouped/aggregated values of the original table.
Sample data:
Record number 3031 has 4 records, 4050 has 6 records so the expected results that I'm hoping to get from the query would be:
avg = 5 min = 4 max = 6
Upvotes: 0
Views: 97
Reputation: 1271141
I think you want something like:
select avg(rec_count), max(rec_count), min(rec_count)
from (select record_number, count(*) as rec_count
from archives
group by record_number
) a
Upvotes: 1