Reputation: 3
I am trying to list customers who have purchased from ProductCategoryKey 1, and if those customers have purchased ProductCategoryKEY 3 more than once show only one row for that customer. My code so far only filters through PC Key 1. How can I say 'of those who have purchased PC Key 1, how many have purchased PC Key 3 at least two times and only show that person once?'
SELECT DISTINCT C.FirstName, C.LastName, PC.EnglishProductCategoryName
FROM dbo.FactInternetSales AS I
INNER JOIN dbo.DimCustomer AS C
ON I.CustomerKey = C.CustomerKey
INNER JOIN dbo.DimProduct AS P
ON I.ProductKey = P.ProductKey
INNER JOIN dbo.DimProductSubcategory AS SC
ON P.ProductSubcategoryKey = SC.ProductSubcategoryKey
INNER JOIN dbo.DimProductCategory as PC
ON SC.ProductCategoryKey = PC.ProductCategoryKey
WHERE PC.ProductCategoryKey = 1
ORDER BY C.LastName ASC, FirstName ASC;
Using Microsoft SQL Server 2019 ADventureWorksDW2017
Upvotes: 0
Views: 1256
Reputation: 27202
I would use a CTE over a temp table unless you have a performance issue. This way you get a single execution plan which is often better (except when its not) than splitting into two execution plans (as using a temp table does).
Also you shouldn't need a distinct
unless you have your logic wrong. A correctly grouped query very rarely needs a distinct
.
WITH cte AS (
SELECT C.FirstName, C.LastName, PC.EnglishProductCategoryName, PC.ProductCategoryKey
FROM dbo.FactInternetSales AS I
INNER JOIN dbo.DimCustomer AS C ON I.CustomerKey = C.CustomerKey
INNER JOIN dbo.DimProduct AS P ON I.ProductKey = P.ProductKey
INNER JOIN dbo.DimProductSubcategory AS SC ON P.ProductSubcategoryKey = SC.ProductSubcategoryKey
INNER JOIN dbo.DimProductCategory AS PC ON SC.ProductCategoryKey = PC.ProductCategoryKey
WHERE PC.ProductCategoryKey = 1 AND PC.ProductCategoryKey = 3
)
SELECT FirstName, LastName, EnglishProductCategoryName, COUNT(*)
FROM cte
WHERE ProductCategoryKey = 3
GROUP BY FirstName, LastName, EnglishProductCategoryName
HAVING COUNT(*) > 2
ORDER BY LastName ASC, FirstName ASC;
I am assuming the logic is correct, because without sample data and expected results its hard to know.
Upvotes: 0
Reputation: 186
I'm a big fan of using temp tables, or variable tables, as a way to further filter data. The first query gets all customers with a PC Key of 1 and 3, and inserts those records into a temp table. Then, the second query uses aggregation and a GROUP BY
clause to count all rows where the PC Key is 3, followed by a HAVING
clause where the count is > 2. The GROUP BY
clause in the second query effectively creates a distinct record-set for your last requirement.
I break up the query into two queries for readability. You could also use nested sub queries to achieve the same result.
SELECT DISTINCT C.FirstName, C.LastName, PC.EnglishProductCategoryName,
PC.ProductCategoryKey
into #temp1
FROM dbo.FactInternetSales AS I
INNER JOIN dbo.DimCustomer AS C
ON I.CustomerKey = C.CustomerKey
INNER JOIN dbo.DimProduct AS P
ON I.ProductKey = P.ProductKey
INNER JOIN dbo.DimProductSubcategory AS SC
ON P.ProductSubcategoryKey = SC.ProductSubcategoryKey
INNER JOIN dbo.DimProductCategory as PC
ON SC.ProductCategoryKey = PC.ProductCategoryKey
WHERE PC.ProductCategoryKey = 1
AND PC.ProductCategoryKey = 3
ORDER BY C.LastName ASC, FirstName ASC
SELECT FirstName, LastName, EnglishProductCategoryName, COUNT(*)
FROM #temp1
WHERE ProductCategoryKey = 3
GROUP BY FirstName, LastName, EnglishProductCategoryName
HAVING COUNT(*) > 2
Upvotes: 1