user10826235
user10826235

Reputation: 69

Max count of ID in table

I have a table (orders) in my SQL db and want to know which article (articleID) was sold most. I'm currently using the the following statement:

SELECT articleID FROM orders 
GROUP BY articleID 
ORDER BY COUNT(articleID) DESC LIMIT 1;

It works but I was wondering if there is any better solution for that. Couldn't find anything similar in previous posts - if you do, please point me in the right direction.

Upvotes: 2

Views: 85

Answers (2)

Schwern
Schwern

Reputation: 164939

As long as articleid is indexed, presumably it's a foreign key so it should already be, your query is fine. An explain analyze shows what's going on. You didn't mention which database you're using, so this is Postgres.

test=> select count(*) from orders;
  count  
---------
3003116

test=> select count(distinct articleid) from orders;
  count  
---------
 1999922

test=> explain analyze SELECT articleID FROM orders 
GROUP BY articleID 
ORDER BY COUNT(articleID) DESC LIMIT 1;
                                                                            QUERY PLAN                                                                             
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=149534.48..149534.48 rows=1 width=12) (actual time=1446.766..1446.767 rows=1 loops=1)
   ->  Sort  (cost=149534.48..156236.51 rows=2680814 width=12) (actual time=1446.765..1446.765 rows=1 loops=1)
         Sort Key: (count(articleid)) DESC
         Sort Method: top-N heapsort  Memory: 25kB
         ->  GroupAggregate  (cost=0.43..136130.41 rows=2680814 width=12) (actual time=3.772..1245.538 rows=1999922 loops=1)
               Group Key: articleid
               ->  Index Only Scan using orders_article_id on orders  (cost=0.43..94306.81 rows=3003092 width=4) (actual time=1.307..648.542 rows=3003116 loops=1)
                     Heap Fetches: 3003116
 Planning time: 4.692 ms
 Execution time: 1446.809 ms

Most of the cost is in scanning the index and aggregating the result. The order by is very cheap.

The cost increases linearly with your number of orders. Double the orders, double the cost. If this is not acceptable you can instead make a separate table for order statistics and update it either periodically or with triggers.

Upvotes: 2

MladenB
MladenB

Reputation: 161

You can try this, I rid off ORDER BY, it's slow operation.

WITH T AS(
SELECT COUNT(*) AS C1, articleID FROM orders
GROUP BY articleID)
SELECT articleID FROM T
WHERE C1 = (SELECT MAX(C1) FROM T)

Upvotes: -1

Related Questions