Reputation: 2017
I want to group the results of a count(*) query into value buckets. I'm testing this on the dellstore2 postgresql sample database. My query below returns the right answers, but does so once for each row in the table (several thousand identical results). I can fix this by adding LIMIT 1
the the end of the query but I'd like to understand why I'm getting the duplicates in case it points to a wider problem with my approach. The query is:
SELECT
(SELECT count(*)
FROM
orders
WHERE
totalamount > 0 AND totalamount <= 100) AS ">0 <= 100",
(SELECT count(*)
FROM
orders
WHERE
totalamount > 100 AND totalamount <= 200) AS ">100 <= 200"
...
FROM
orders;
EDIT Andomar's answer also allowed me to find the following approach (adapted from an example in SQL in a nutshell (O'Reilly)). This lets me have the buckets in one column, with a row for each bucket/answer pairing. I thought I'd include it for anyone with that use-case:
SELECT CASE
WHEN totalamount IS NULL THEN 'Unknown'
WHEN totalamount <= 100 THEN 'Not more than 100'
WHEN totalamount <= 200 THEN 'Not more than 200'
ELSE 'Over 200'
END "Bucket",
COUNT(*) "Number of results"
FROM
orders
GROUP BY CASE
WHEN totalamount IS NULL THEN 'Unknown'
WHEN totalamount <= 100 THEN 'Not more than 100'
WHEN totalamount <= 200 THEN 'Not more than 200'
ELSE 'Over 200'
END
ORDER BY
MIN(totalamount);
Upvotes: 1
Views: 2282
Reputation: 238136
You're selecting every row from orders
, and then for each row, the subqueries are evaluated.
Consider this approach instead:
select count(case when 0 < totalamount and totalamount <= 100 then 1 end)
as "<0,100]"
, count(case when 100 < totalamount and totalamount <= 200 then 1 end)
as "<100,200]"
from Orders
This would calculate both aggregates in a single table scan.
Upvotes: 6
Reputation: 2290
Because you included them as a subquery. From the looks of it, both the count subqueries are executed for each row in the orders
table. If you omit the from
piece from the main query you should only get one row.
For example, the query
SELECT 'John Doe' FROM orders
returns 'John Doe' for each row in the orders table.
SELECT 'John Doe'
only selects one row.
Disclaimer: this is sql server behaviour.
Upvotes: 3