Reputation: 12953
Given a table with columns such as:
Date, Type
I'm running the following SQL:
SELECT Type, count(*) as CountPerType
FROM myTable
WHERE Date between 20200101 and 20200131
GROUP BY count(*)
I want to have an extra column Percentage
which will have 100.0 * CountPerType / SUM(CountPerType)
. What's the most efficient way to do it in PrestoDB (which powers Amazon Athena)?
Upvotes: 3
Views: 11920
Reputation: 1269503
I would write the query without a subquery. You can mix window functions and aggregation functions:
SELECT Type, COUNT(*) as CountPerType,
COUNT(*) * 100.0 / SUM(COUNT(*)) OVER () as percentage
FROM t
WHERE Date BETWEEN 20200101 AND 20200131
GROUP BY Type;
I don't know if the performance is different from the version using the subquery (this should be at least as good). But the query is definitely simpler.
Upvotes: 12
Reputation: 7503
You can use window functionality to achieve that. You should always do a group by of non-aggregated fields.
select
Type,
CountPerType,
100.0 * CountPerType/sum(CountPerType) over () as columnName
from
(
SELECT
Type,
count(*) as CountPerType
FROM myTable
WHERE Date between 20200101 and 20200131
GROUP BY
Type
) subq
Upvotes: 2