Daniel Hitzel
Daniel Hitzel

Reputation: 1342

SQL - get the max value from a group by + count table

I am referring to this data base: http://www.w3schools.com/sql/trysql.asp?filename=trysql_select_all

I am trying to fulfill the following query: "Which supplier provides most products?"

I came to this query

select SupplierName, max(count)
from(
    select SupplierName, count(SupplierName) as count
    from 
        Products as p
        inner join 
        Suppliers as s
        on p.SupplierID = s.SupplierID
    group by SupplierName
)

But the result is only half way right.

  SupplierName          |   max(count)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
                        |
  Zaanse Snoepfabriek   |   5

The value 5 is right, but the SupplierName is just the last value of the inner SQL Query. I always though the max(count) statement automatically provides the correct SupplierName. Why isn't it like that?

If I change the query to this, everything is alright:

select SupplierName, max(count)
from(
    select SupplierName, count(SupplierName) as count
    from 
        Products as p
        inner join 
        Suppliers as s
        on p.SupplierID = s.SupplierID
    group by SupplierName
) t
where t.count = 
(
    select max(count)
    from(
        select SupplierName, count(SupplierName) as count
        from 
            Products as p
            inner join 
            Suppliers as s
            on p.SupplierID = s.SupplierID
        group by SupplierName
    ) 
)

But this way I create a whole new table just to get the max and then compare it to t.count. If the data base is 1,000,000+ records things get serious.

Is there a more elegant way to this problem:

Count something then get the max of it, and get the correlated ID/Name to it.

Upvotes: 0

Views: 1765

Answers (1)

Alex Szabo
Alex Szabo

Reputation: 3276

Can you try this? I think it is doing the right job, as you are querying the MAX(COUNT) from the table, and I suppose it returns the value with the highest number of counts.

select S.SupplierName, R.count
from Suppliers S
LEFT JOIN (
    select SupplierName, count(SupplierName) as count
    from 
        Products as p
        inner join 
        Suppliers as s
        on p.SupplierID = s.SupplierID
    group by SupplierName
) R
ON R.SupplierName = S.SupplierName
ORDER BY S.SupplierName

However, for your original question "Which product was bought most?", I'd use this query:

SELECT p.productname, od.ProductId, SUM(Quantity)    
FROM [OrderDetails] od
JOIN Products P
ON od.ProductId = P.productId
GROUP BY od.ProductId, P.Productname
ORDER BY SUM(Quantity) DESC

As suppliers supply material, and do not relate to the sold goods. The customers are the ones buying. The question however only focuses on what type of product was sold with the highest quantites - this can be found out from the Order details table and if you join it with the products table, you can name all products sold and the quantity they were sold in.

UPDATE

And for the supplier who provides the most products, I'd use this:

SELECT P.SupplierId, S.SupplierName, COUNT(P.SupplierID) 
FROM Products P
JOIN Suppliers S
ON S.SupplierId = P.SupplierId
GROUP BY P.SupplierId, S.SupplierName
ORDER BY COUNT(P.SupplierId) DESC
LIMIT 1

LIMIT 1 (Slightly different in MSSQL - SELECT TOP 1 ...) will ensure that you get the top result. However, if there are suppliers tied for the first place with the same number of supplied goods then you will have to store the maximum count in a separate variable, or query. As in the site hosted by W3SCHOOLS using variables are not the way to go, here's a query that can do the same work just fine:

SELECT S.SupplierName, t.TotalCount
FROM Suppliers S
JOIN
(   SELECT P.SupplierId, S.SupplierName, COUNT(P.SupplierID) as TotalCount
    FROM Products P
    JOIN Suppliers S
    ON S.SupplierId = P.SupplierId
    GROUP BY P.SupplierId, S.SupplierName) t
ON t.SupplierId = S.SupplierId
JOIN
(   SELECT SupplierId, SupplierName, MAX(TotalCount) as Maximum
    FROM 
    (SELECT P.SupplierId, S.SupplierName, COUNT(P.SupplierID) as TotalCount
    FROM Products P
    JOIN Suppliers S
    ON S.SupplierId = P.SupplierId
    GROUP BY P.SupplierId, S.SupplierName) t2
) t3    
ON t3.Maximum = t.TotalCount
WHERE t.TotalCount = t3.Maximum
ORDER BY t.TotalCount DESC

Which is basically the same as you have posted in your original question, so it's not much of a help... :\ If it would be in a "real life" example, you could introduce a variable that would help tidying the query up a bit. Basically you assign the result of the query that I've used in the second join above, and you just reference it in the where clause.

There's more on variables here: http://dev.mysql.com/doc/refman/5.0/en/user-variables.html

SET @max = (SELECT MAX(TotalCount) as Maximum
    FROM 
    (    SELECT P.SupplierId, S.SupplierName, COUNT(P.SupplierID) as TotalCount
        FROM Products P
        JOIN Suppliers S
        ON S.SupplierId = P.SupplierId
        GROUP BY P.SupplierId, S.SupplierName) t);

SELECT S.SupplierName, t.TotalCount
FROM Suppliers S
JOIN
(   SELECT P.SupplierId, S.SupplierName, COUNT(P.SupplierID) as TotalCount
    FROM Products P
    JOIN Suppliers S
    ON S.SupplierId = P.SupplierId
    GROUP BY P.SupplierId, S.SupplierName) t
ON t.SupplierId = S.SupplierId   
WHERE t.TotalCount = @max
ORDER BY t.TotalCount DESC

Upvotes: 1

Related Questions