Michelle
Michelle

Reputation: 365

Trying to Find Most Popular Product

I am trying to write a query that pulls the top 5 customers by the amount of all products they have purchased. No problem. But in addition, I need to find the product that was the most popular for each of these 5 customers.

I think I can do this by looking at what product they bought the most (so what productID's quantity is the max quantity), but I'm not quite sure how to get at the productID. Any ideas? Am I going about this all wrong? Thanks!

select top 5 c.CustomerID, sum(sod.orderqty) AS 'Amount Purchased', 
max(sod.orderqty) AS 'Most Purchased'   
from Sales.Customer c
inner join Sales.SalesOrderHeader soh on soh.CustomerID = c.CustomerID
inner join Sales.SalesOrderDetail sod on sod.SalesOrderID = soh.SalesOrderID
inner join Production.Product p on p.ProductID = sod.ProductID
group by c.CustomerID
order by 'Amount Purchased' desc

I tried the query below, but for some reason, including the p.name in the group by completely throws off the max. For example, in this query, the 'Most Purchased' should be 20, but it shows as 18 (which is the 2nd-next max.)

select top 5 soh.CustomerID, sum(sod.orderqty) AS 'Amount Purchased'
, max(sod.orderqty) AS 'Most Purchased', p.name 
from Sales.SalesOrderHeader soh 
inner join Sales.SalesOrderDetail sod on sod.SalesOrderID = soh.SalesOrderID
inner join Production.Product p on p.ProductID = sod.ProductID
where soh.CustomerID = 29705
group by soh.CustomerID, p.name
order by 'Amount Purchased' desc

Upvotes: 2

Views: 2789

Answers (2)

Vladimir Baranov
Vladimir Baranov

Reputation: 32703

We can do it in two stages. I'm using SQL Server 2008.

At first find the top 5 customers as you did:

WITH
CTE_TopCustomers
AS
(
    select top (5)
        c.CustomerID
        , sum(sod.orderqty) AS 'Amount Purchased'
    from
        Sales.Customer c
        inner join Sales.SalesOrderHeader soh on soh.CustomerID = c.CustomerID
        inner join Sales.SalesOrderDetail sod on sod.SalesOrderID = soh.SalesOrderID
        inner join Production.Product p on p.ProductID = sod.ProductID
    group by
        c.CustomerID
    order by 'Amount Purchased' desc
)

Then for each of these customers find the most popular product. See the sub-query in CROSS APPLY. The "most popular" here means the product that the customer bought most in total. Say, if customer bought 10 units of some product with ID=1 on one day and 20 units of the same product with ID=1 on the next day, the sum would be 30. If the same customer bought 25 units of another product with ID=2 all on one day, then the most popular product for that customer would be the one with ID=1 and 30 total units.

If you want to choose product with ID=2 as the most popular in this example, change the SUM to MAX inside the CROSS APPLY.

SELECT
    CTE_TopCustomers.CustomerID
    ,CTE_TopCustomers.[Amount Purchased]
    ,CTE_Products.ProductID
    ,CTE_Products.ProductName
    ,CTE_Products.SumCustomerProductQty
FROM
    CTE_TopCustomers
    CROSS APPLY
    (
        SELECT TOP (1)
            p.ProductID
            ,p.name AS ProductName
            ,SUM(sod.orderqty) AS SumCustomerProductQty
        FROM
            Sales.SalesOrderHeader soh on soh.CustomerID = CTE_TopCustomers.CustomerID
            inner join Sales.SalesOrderDetail sod on sod.SalesOrderID = soh.SalesOrderID
            inner join Production.Product p on p.ProductID = sod.ProductID
        GROUP BY
            p.ProductID
            ,p.name
        ORDER BY SumCustomerProductQty DESC
    ) AS CTE_Products
ORDER BY [Amount Purchased] DESC;

For the final query just put both code blocks together.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1271151

Do an intermediate aggregation by product:

select top 5 CustomerId, sum(AmountPurchased) as AmountPurchased,
       max(AmountPurchased) as maxAmountPurchased,
       max(case when seqnum = 1 then productId end) as MostPurchased
from (select c.CustomerID, p.ProductId, sum(sod.orderqty) AS AmountPurchased, 
             row_number() over (partition by c.CustomerId order by sum(sod.orderqty) desc) as seqnum
      from Sales.Customer c inner join
           Sales.SalesOrderHeader soh
           on soh.CustomerID = c.CustomerID inner join
           Sales.SalesOrderDetail sod
           on sod.SalesOrderID = soh.SalesOrderID inner join
           Production.Product p
           on p.ProductID = sod.ProductID
      group by c.CustomerID, ProductId
     ) cp
group by CustomerId
order by AmountPurchased desc;

Note that this is slightly different from your query with regards to the maximum amount. This version provides the maximum at the total product level. You can get your version by using max() in the subquery, but this seems like what you might really want.

Upvotes: 0

Related Questions