JavaSheriff
JavaSheriff

Reputation: 7665

generate report from columns stored in table

I would like to generate the report below based on the data and category table:

category table hold the category and pointer to the field on data table:

CAT     FIELD
PRINTER     P1
CHAIR       P3
TABLE       P2

data table holds the data and phisical fields:

ITEM_ID P1  P2  P3  P4
1       A   B   C   D
2       X   Y   Z   A 
3       N   M   O   P

this is how the report should look like:

ITEM_ID     CAT
1       PRINTER_A
1       CHAIR_C
1       TABLE_B
2       PRINTER_X
2       CHAIR_Z
2       TABLE_Y
3       PRINTER_N
3       CHAIR_O
3       TABLE_M

for a solution I can fetch all ITEMS in DATA table then loop on each CATEGORY and do an insert, but there are millions of items in DATA table and 20+ items in the category table, it will have bad performance.

Any idea how to generate this efficiently?

Source:

CREATE TABLE [dbo].[CAT_REPORT](
    [ITEM_ID] [nchar](100) NULL,
    [CAT] [nchar](100) NULL
) 
GO

CREATE TABLE [dbo].[DATA](
[ITEM_ID] [nchar](10) NULL,
[P1] [nchar](50) NULL,
[P2] [nchar](50) NULL,
[P3] [nchar](50) NULL,
[P4] [nchar](50) NULL
)

CREATE TABLE [dbo].[CATEGORY](
    [CAT] [nchar](10) NULL,
    [FIELD] [nchar](10) NULL
)


INSERT [dbo].[CATEGORY] ([CAT], [FIELD]) VALUES ('PRINTER', 'P1')
GO
INSERT .[CATEGORY] ([CAT], [FIELD]) VALUES ('CHAIR', 'P3')
GO
INSERT .[CATEGORY] ([CAT], [FIELD]) VALUES ('TABLE', 'P2')
GO
INSERT .[DATA] ([ITEM_ID], [P1], [P2], [P3], [P4]) VALUES ('1', 'A', 'B', 'C', 'D')
GO
INSERT .[DATA] ([ITEM_ID], [P1], [P2], [P3], [P4]) VALUES ('2', 'X', 'Y', 'Z', 'A')
GO
INSERT .[DATA] ([ITEM_ID], [P1], [P2], [P3], [P4]) VALUES ('3', 'N', 'M', 'O', 'P')
GO

Here Is my Stored I got so far:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
DROP procedure fill_category_report_table
go
CREATE PROCEDURE fill_category_report_table
AS
BEGIN

    SET NOCOUNT ON;
    DECLARE @CatName nvarchar(20),@CatField nvarchar(20),@ItemId nvarchar(20),@CatNameOut nvarchar(20),@out_var nvarchar(20)    
    DECLARE @sql nvarchar(100)


    DECLARE DATA_CUR CURSOR 
    LOCAL STATIC READ_ONLY FORWARD_ONLY
        FOR 
    SELECT  ITEM_ID from  [DATA]

    OPEN DATA_CUR
    FETCH NEXT FROM DATA_CUR INTO @ItemId
    WHILE @@FETCH_STATUS = 0
    BEGIN 
        PRINT @ItemId

                DECLARE CAT_CUR CURSOR 
                LOCAL STATIC READ_ONLY FORWARD_ONLY
                    FOR 
                SELECT  CAT,FIELD from  CATEGORY
                OPEN CAT_CUR
                FETCH NEXT FROM CAT_CUR INTO @CatName,@CatField
                WHILE @@FETCH_STATUS = 0
                BEGIN 
                    PRINT @CatName                  
                    SET @sql = N'SELECT @CatNameOut=@CatName + ''_'' + ' + @CatField + ' FROM [DATA] where ITEM_ID=' +@ItemId 
                    EXECUTE  sp_executesql @sql,N'@CatName  varchar(100), @CatNameOut varchar(100) OUTPUT',@CatNameOut = @CatNameOut output,@CatName=@CatName;

                    INSERT INTO CAT_REPORT ([ITEM_ID],[CAT])    VALUES   (@ItemId  ,@CatNameOut)
                    FETCH NEXT FROM CAT_CUR INTO  @CatName,@CatField
                END
                CLOSE CAT_CUR
                DEALLOCATE CAT_CUR
        FETCH NEXT FROM DATA_CUR INTO @ItemId
    END
    CLOSE DATA_CUR
    DEALLOCATE DATA_CUR
END
GO

Upvotes: 1

Views: 122

Answers (2)

pacreely
pacreely

Reputation: 1931

You definitely want to a void a RBAR (Row By Agonising Row) Cursor.

Your code is performing an unpivot of data, in SQL Server the UNPIVOT function is not very good, a better way to unpivot data is to use CROSS APPLY with VALUES

Based on Comments it sounds like you don't need a Dynamic Query. In SQL, Dynamic queries tend to be used to handle queries that might need to dynamically access different Tables or Columns. All that seems to change are the Records in the Category Table but they will always refer to columns P1 - P4 where relevant.

Your Table Structure appears to be Static -only the data changes- if so then Query 1 will do.

--Query 1    
    INSERT INTO dbo.[CAT_REPORT]
        SELECT
            D.ITEM_ID
            ,RTRIM(C.CAT) + '_' + cl.Val AS [CAT]
        FROM    
            dbo.[DATA] D
                CROSS APPLY (VALUES ('P1',D.P1),('P3',D.P3),('P2',D.P2)) AS cl(Field,Val) 
                INNER JOIN dbo.[CATEGORY] C
                    ON cl.Field = C.FIELD

If the Structure of the Data Table can change, then Query 2 would be relevant.

--Query 2    
--Dynamically build Field list from CATEGORY table.
    DECLARE @Fields NVARCHAR(200) = (
    SELECT TOP 1
        STUFF((SELECT ',(''' + RTRIM(FIELD) + ''',D.' + RTRIM(FIELD) + ')' FROM [dbo].[CATEGORY] FOR XML PATH ('')),1,1,'') FIELD
    FROM [dbo].[CATEGORY]
    WHERE
        --Ensure the column name exists in DATA
        Field IN (SELECT name from sys.columns WHERE object_id = object_id('Data'))
    )

    --Build query to look at Fields - use OPITON(RECOMPILE) to prevent Parameter Sniffing
    DECLARE @QRY NVARCHAR(2000) = '
    INSERT INTO dbo.[CAT_REPORT]
SELECT
        D.ITEM_ID
        ,RTRIM(C.CAT) + ''_'' + cl.Val AS [CAT]
    FROM    
        [dbo].[DATA] D
            CROSS APPLY (VALUES ' + @Fields + ') AS cl(Field,Val) 
            INNER JOIN [dbo].[CATEGORY] C
                ON cl.Field = C.FIELD
    OPTION (RECOMPILE)'

    --Run Query with Dynamically identified Fields
    EXEC (@QRY)

SELECT * FROM  dbo.[CAT_REPORT]

Upvotes: 1

Laughing Vergil
Laughing Vergil

Reputation: 3756

Based on your sample data, this performs the query without resorting to a cursor - which will ABSOLUTELY RUIN your performance.

SELECT 
    D.ITEM_ID,
    RTRIM(Cast(C.CAT as nVarChar)) + '_' + 
    CASE C.FIELD
        WHEN 'P1' 
            THEN d.P1
        WHEN 'P2' 
            THEN d.P2
        WHEN 'P3' 
            THEN d.P3
        WHEN 'P4' 
            THEN d.P4
        ELSE NULL
    END as Cat
FROM Data D
CROSS JOIN Category C
ORDER BY ITEM_ID, FIELD

By the way, you really shouldn't store data in a char/nchar field without a really good reason. These fields use the full amount of data space, even if they are only storing one character. Varchar/nVarchar is a much more compact way of storing data unless you absolutely need each of the values stored in a field to be the same length.

Upvotes: 2

Related Questions