Reputation: 4479
I have a very complex query that I'm working with and one part of it has to retrieve a category associated with a product. The categories are stored recursively in a Category
table. The product-category mapping is in the ProductCategory
table (technically, a single product can have multiple categories, but for now let's leave that off the table, unless it is a simple variable to account for).
The Category
table is very simple. One column is CategoryID
, another is ParentCategoryID
, and a third is a Name
column. From this, categories are nested. The ProductCategory
table is also simple. One column is ProductID
another is CategoryID
.
I need to retrieve the top-most and second-top-most categories for any given product. I then use this information in a report with some analytics. My solution is really slow and doesn't scale very well. I can't figure out how I might more efficiently extract the data that I need.
What my solution attempts to do is collect together all the categories that are parents of a particular products assigned category and then grab the last two I find and return those. I've done this as a scalar function where I send the current CategoryID
and the level I want back, so 0 for one call and 1 for another call.
My sample code:
WITH Categories AS (
SELECT DISTINCT
CategoryID
FROM
ProductCategory
), CategoriesAtDepth AS (
SELECT
Categories.CategoryID
, dbo.WR_f_GetCategoryIDAtDepth(Categories.CategoryID, 0) AS TopCategory
, dbo.WR_f_GetCategoryIDAtDepth(Categories.CategoryID, 1) AS SecondCategory
FROM
Categories
)
SELECT
CategoriesAtDepth.CategoryID
, c1.Name AS TopCategory
, c2.Name AS SecondCategory
FROM
CategoriesAtDepth LEFT JOIN
Category AS c1 ON CategoriesAtDepth.TopCategory = c1.CategoryID LEFT JOIN
Category AS c2 ON CategoriesAtDepth.SecondCategory = c2.CategoryID
And the function code:
CREATE FUNCTION WR_f_GetCategoryIDAtDepth
(
@CategoryID AS int
,@Depth AS int = 0
)
RETURNS int
AS
BEGIN
-- Declare the return variable here
DECLARE @Result int
DECLARE @CurrentHeight int = 0
DECLARE @CurrentCategoryID int = @CategoryID
DECLARE @CategoryLevels table
(
Height int
,CategoryID int
)
BEGIN
--Populate a table with all the categoy IDs in the chain
WHILE @CurrentCategoryID > 0
BEGIN
INSERT INTO @CategoryLevels (Height, CategoryID) VALUES (@CurrentHeight + 1, @CurrentCategoryID)
SET @CurrentCategoryID = (SELECT ParentCategoryID FROM Category WHERE CategoryID = ISNULL((SELECT CategoryID FROM @CategoryLevels WHERE Height = @CurrentHeight + 1), 0))
SET @CurrentHeight = @CurrentHeight + 1
END
SET @Result = (SELECT CategoryID FROM @CategoryLevels WHERE Height = (@CurrentHeight - @Depth))
END
-- Return the result of the function
RETURN @Result
END
GO
I thought more about the comment about using recursive CTE's by @George Mavritsakis and decided to try to implement it in the function and came up with this much faster solution:
CREATE FUNCTION WR_f_GetCategoryIDAtDepth
(
@CategoryID AS int
,@Depth AS int = 0
)
RETURNS int
AS
BEGIN
-- Declare the return variable here
DECLARE @Result int
DECLARE @CategoryLevels table
(
Height int
,CategoryID int
)
BEGIN
--Populate a table with all the categoy IDs in the chain
WITH Base AS (
SELECT
0 AS Height
, @CategoryID AS CategoryID
UNION ALL
SELECT
Height + 1
, ParentCategoryID
FROM
Category INNER JOIN
Base ON Category.CategoryID = Base.CategoryID
)
INSERT INTO @CategoryLevels (Height, CategoryID)
SELECT * FROM Base
SET @Result = (SELECT CategoryID FROM @CategoryLevels WHERE Height = ((SELECT MAX(Height) FROM @CategoryLevels) - @Depth - 1))
END
-- Return the result of the function
RETURN @Result
END
GO
Upvotes: 1
Views: 1088
Reputation: 7083
You MUST look into recursive CTEs: http://technet.microsoft.com/en-us/library/ms186243%28v=sql.105%29.aspx
Your solution is slow because you are quering many times your Category table with the function WR_f_GetCategoryIDAtDepth.
Upvotes: 1