cjbarth
cjbarth

Reputation: 4479

Retrieve specific level in recursive SQL query

I have a very complex query that I'm working with and one part of it has to retrieve a category associated with a product. The categories are stored recursively in a Category table. The product-category mapping is in the ProductCategory table (technically, a single product can have multiple categories, but for now let's leave that off the table, unless it is a simple variable to account for).

The Category table is very simple. One column is CategoryID, another is ParentCategoryID, and a third is a Name column. From this, categories are nested. The ProductCategory table is also simple. One column is ProductID another is CategoryID.

I need to retrieve the top-most and second-top-most categories for any given product. I then use this information in a report with some analytics. My solution is really slow and doesn't scale very well. I can't figure out how I might more efficiently extract the data that I need.

What my solution attempts to do is collect together all the categories that are parents of a particular products assigned category and then grab the last two I find and return those. I've done this as a scalar function where I send the current CategoryID and the level I want back, so 0 for one call and 1 for another call.

My sample code:

WITH Categories AS (
    SELECT DISTINCT
        CategoryID
    FROM
        ProductCategory
), CategoriesAtDepth AS (
    SELECT
        Categories.CategoryID
        , dbo.WR_f_GetCategoryIDAtDepth(Categories.CategoryID, 0) AS TopCategory
        , dbo.WR_f_GetCategoryIDAtDepth(Categories.CategoryID, 1) AS SecondCategory
    FROM
        Categories
)
SELECT
    CategoriesAtDepth.CategoryID
    , c1.Name AS TopCategory
    , c2.Name AS SecondCategory
FROM
    CategoriesAtDepth LEFT JOIN
    Category AS c1 ON CategoriesAtDepth.TopCategory = c1.CategoryID LEFT JOIN
    Category AS c2 ON CategoriesAtDepth.SecondCategory = c2.CategoryID

And the function code:

CREATE FUNCTION WR_f_GetCategoryIDAtDepth 
(
    @CategoryID AS int
    ,@Depth AS int = 0
)
RETURNS int
AS
BEGIN

    -- Declare the return variable here
    DECLARE @Result int
    DECLARE @CurrentHeight int = 0
    DECLARE @CurrentCategoryID int = @CategoryID
    DECLARE @CategoryLevels table
    (
        Height int
        ,CategoryID int
    )

    BEGIN
        --Populate a table with all the categoy IDs in the chain
        WHILE @CurrentCategoryID > 0
        BEGIN
            INSERT INTO @CategoryLevels (Height, CategoryID) VALUES (@CurrentHeight + 1, @CurrentCategoryID)
            SET @CurrentCategoryID = (SELECT ParentCategoryID FROM Category WHERE CategoryID = ISNULL((SELECT CategoryID FROM @CategoryLevels WHERE Height = @CurrentHeight + 1), 0))
            SET @CurrentHeight = @CurrentHeight + 1
        END
        SET @Result = (SELECT CategoryID FROM @CategoryLevels WHERE Height = (@CurrentHeight - @Depth))
    END

    -- Return the result of the function
    RETURN @Result

END
GO

I thought more about the comment about using recursive CTE's by @George Mavritsakis and decided to try to implement it in the function and came up with this much faster solution:

CREATE FUNCTION WR_f_GetCategoryIDAtDepth 
(
    @CategoryID AS int
    ,@Depth AS int = 0
)
RETURNS int
AS
BEGIN

    -- Declare the return variable here
    DECLARE @Result int
    DECLARE @CategoryLevels table
    (
        Height int
        ,CategoryID int
    )

    BEGIN
        --Populate a table with all the categoy IDs in the chain
        WITH Base AS (
            SELECT
                0 AS Height
                , @CategoryID AS CategoryID

            UNION ALL

            SELECT
                Height + 1
                , ParentCategoryID
            FROM
                Category INNER JOIN
                    Base ON Category.CategoryID = Base.CategoryID
        )
        INSERT INTO @CategoryLevels (Height, CategoryID)
            SELECT * FROM Base

        SET @Result = (SELECT CategoryID FROM @CategoryLevels WHERE Height = ((SELECT MAX(Height) FROM @CategoryLevels) - @Depth - 1))
    END

    -- Return the result of the function
    RETURN @Result

END
GO

Upvotes: 1

Views: 1088

Answers (1)

George Mavritsakis
George Mavritsakis

Reputation: 7083

You MUST look into recursive CTEs: http://technet.microsoft.com/en-us/library/ms186243%28v=sql.105%29.aspx

Your solution is slow because you are quering many times your Category table with the function WR_f_GetCategoryIDAtDepth.

Upvotes: 1

Related Questions