Reputation: 365
I am trying to write a query that pulls the top 5 customers by the amount of all products they have purchased. No problem. But in addition, I need to find the product that was the most popular for each of these 5 customers.
I think I can do this by looking at what product they bought the most (so what productID's quantity is the max quantity), but I'm not quite sure how to get at the productID. Any ideas? Am I going about this all wrong? Thanks!
select top 5 c.CustomerID, sum(sod.orderqty) AS 'Amount Purchased',
max(sod.orderqty) AS 'Most Purchased'
from Sales.Customer c
inner join Sales.SalesOrderHeader soh on soh.CustomerID = c.CustomerID
inner join Sales.SalesOrderDetail sod on sod.SalesOrderID = soh.SalesOrderID
inner join Production.Product p on p.ProductID = sod.ProductID
group by c.CustomerID
order by 'Amount Purchased' desc
I tried the query below, but for some reason, including the p.name in the group by completely throws off the max. For example, in this query, the 'Most Purchased' should be 20, but it shows as 18 (which is the 2nd-next max.)
select top 5 soh.CustomerID, sum(sod.orderqty) AS 'Amount Purchased'
, max(sod.orderqty) AS 'Most Purchased', p.name
from Sales.SalesOrderHeader soh
inner join Sales.SalesOrderDetail sod on sod.SalesOrderID = soh.SalesOrderID
inner join Production.Product p on p.ProductID = sod.ProductID
where soh.CustomerID = 29705
group by soh.CustomerID, p.name
order by 'Amount Purchased' desc
Upvotes: 2
Views: 2789
Reputation: 32703
We can do it in two stages. I'm using SQL Server 2008.
At first find the top 5 customers as you did:
WITH
CTE_TopCustomers
AS
(
select top (5)
c.CustomerID
, sum(sod.orderqty) AS 'Amount Purchased'
from
Sales.Customer c
inner join Sales.SalesOrderHeader soh on soh.CustomerID = c.CustomerID
inner join Sales.SalesOrderDetail sod on sod.SalesOrderID = soh.SalesOrderID
inner join Production.Product p on p.ProductID = sod.ProductID
group by
c.CustomerID
order by 'Amount Purchased' desc
)
Then for each of these customers find the most popular product. See the sub-query in CROSS APPLY
. The "most popular" here means the product that the customer bought most in total. Say, if customer bought 10 units of some product with ID=1
on one day and 20 units of the same product with ID=1
on the next day, the sum would be 30. If the same customer bought 25 units of another product with ID=2
all on one day, then the most popular product for that customer would be the one with ID=1
and 30 total units.
If you want to choose product with ID=2
as the most popular in this example, change the SUM
to MAX
inside the CROSS APPLY
.
SELECT
CTE_TopCustomers.CustomerID
,CTE_TopCustomers.[Amount Purchased]
,CTE_Products.ProductID
,CTE_Products.ProductName
,CTE_Products.SumCustomerProductQty
FROM
CTE_TopCustomers
CROSS APPLY
(
SELECT TOP (1)
p.ProductID
,p.name AS ProductName
,SUM(sod.orderqty) AS SumCustomerProductQty
FROM
Sales.SalesOrderHeader soh on soh.CustomerID = CTE_TopCustomers.CustomerID
inner join Sales.SalesOrderDetail sod on sod.SalesOrderID = soh.SalesOrderID
inner join Production.Product p on p.ProductID = sod.ProductID
GROUP BY
p.ProductID
,p.name
ORDER BY SumCustomerProductQty DESC
) AS CTE_Products
ORDER BY [Amount Purchased] DESC;
For the final query just put both code blocks together.
Upvotes: 1
Reputation: 1271151
Do an intermediate aggregation by product:
select top 5 CustomerId, sum(AmountPurchased) as AmountPurchased,
max(AmountPurchased) as maxAmountPurchased,
max(case when seqnum = 1 then productId end) as MostPurchased
from (select c.CustomerID, p.ProductId, sum(sod.orderqty) AS AmountPurchased,
row_number() over (partition by c.CustomerId order by sum(sod.orderqty) desc) as seqnum
from Sales.Customer c inner join
Sales.SalesOrderHeader soh
on soh.CustomerID = c.CustomerID inner join
Sales.SalesOrderDetail sod
on sod.SalesOrderID = soh.SalesOrderID inner join
Production.Product p
on p.ProductID = sod.ProductID
group by c.CustomerID, ProductId
) cp
group by CustomerId
order by AmountPurchased desc;
Note that this is slightly different from your query with regards to the maximum amount. This version provides the maximum at the total product level. You can get your version by using max()
in the subquery, but this seems like what you might really want.
Upvotes: 0