Max count of ID in table

Question

I have a table (orders) in my SQL db and want to know which article (articleID) was sold most. I'm currently using the the following statement:

SELECT articleID FROM orders 
GROUP BY articleID 
ORDER BY COUNT(articleID) DESC LIMIT 1;

It works but I was wondering if there is any better solution for that. Couldn't find anything similar in previous posts - if you do, please point me in the right direction.

Schwern · Accepted Answer

As long as articleid is indexed, presumably it's a foreign key so it should already be, your query is fine. An explain analyze shows what's going on. You didn't mention which database you're using, so this is Postgres.

test=> select count(*) from orders;
  count  
---------
3003116

test=> select count(distinct articleid) from orders;
  count  
---------
 1999922

test=> explain analyze SELECT articleID FROM orders 
GROUP BY articleID 
ORDER BY COUNT(articleID) DESC LIMIT 1;
                                                                            QUERY PLAN                                                                             
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=149534.48..149534.48 rows=1 width=12) (actual time=1446.766..1446.767 rows=1 loops=1)
   ->  Sort  (cost=149534.48..156236.51 rows=2680814 width=12) (actual time=1446.765..1446.765 rows=1 loops=1)
         Sort Key: (count(articleid)) DESC
         Sort Method: top-N heapsort  Memory: 25kB
         ->  GroupAggregate  (cost=0.43..136130.41 rows=2680814 width=12) (actual time=3.772..1245.538 rows=1999922 loops=1)
               Group Key: articleid
               ->  Index Only Scan using orders_article_id on orders  (cost=0.43..94306.81 rows=3003092 width=4) (actual time=1.307..648.542 rows=3003116 loops=1)
                     Heap Fetches: 3003116
 Planning time: 4.692 ms
 Execution time: 1446.809 ms

Most of the cost is in scanning the index and aggregating the result. The order by is very cheap.

The cost increases linearly with your number of orders. Double the orders, double the cost. If this is not acceptable you can instead make a separate table for order statistics and update it either periodically or with triggers.

Max count of ID in table

Answers (2)

Related Questions