Reputation: 69
I have a table (orders) in my SQL db and want to know which article (articleID) was sold most. I'm currently using the the following statement:
SELECT articleID FROM orders
GROUP BY articleID
ORDER BY COUNT(articleID) DESC LIMIT 1;
It works but I was wondering if there is any better solution for that. Couldn't find anything similar in previous posts - if you do, please point me in the right direction.
Upvotes: 2
Views: 85
Reputation: 164939
As long as articleid
is indexed, presumably it's a foreign key so it should already be, your query is fine. An explain analyze
shows what's going on. You didn't mention which database you're using, so this is Postgres.
test=> select count(*) from orders;
count
---------
3003116
test=> select count(distinct articleid) from orders;
count
---------
1999922
test=> explain analyze SELECT articleID FROM orders
GROUP BY articleID
ORDER BY COUNT(articleID) DESC LIMIT 1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=149534.48..149534.48 rows=1 width=12) (actual time=1446.766..1446.767 rows=1 loops=1)
-> Sort (cost=149534.48..156236.51 rows=2680814 width=12) (actual time=1446.765..1446.765 rows=1 loops=1)
Sort Key: (count(articleid)) DESC
Sort Method: top-N heapsort Memory: 25kB
-> GroupAggregate (cost=0.43..136130.41 rows=2680814 width=12) (actual time=3.772..1245.538 rows=1999922 loops=1)
Group Key: articleid
-> Index Only Scan using orders_article_id on orders (cost=0.43..94306.81 rows=3003092 width=4) (actual time=1.307..648.542 rows=3003116 loops=1)
Heap Fetches: 3003116
Planning time: 4.692 ms
Execution time: 1446.809 ms
Most of the cost is in scanning the index and aggregating the result. The order by
is very cheap.
The cost increases linearly with your number of orders. Double the orders, double the cost. If this is not acceptable you can instead make a separate table for order statistics and update it either periodically or with triggers.
Upvotes: 2
Reputation: 161
You can try this, I rid off ORDER BY, it's slow operation.
WITH T AS(
SELECT COUNT(*) AS C1, articleID FROM orders
GROUP BY articleID)
SELECT articleID FROM T
WHERE C1 = (SELECT MAX(C1) FROM T)
Upvotes: -1