Jeff Corn
Jeff Corn

Reputation: 1

XML SQL Optimization needed

Hi I been trying to insert into two tables (groups and fields) from XML in SQL. But the solution either doesn't fix my problem or performance is slow as Groups and Fields can number in hundreds of thousands.

A sample of the XML:

<?xml version="1.0" encoding="utf-16"?>
<FB_Flow
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema" id="1">
    <groups>
        <FB_FlowGroup counter="1125" position="2" positionparent="0" id="0">
            <fields>
                <FB_FlowField>
                    <value>TEST1</value>
                    <counter>111</counter>
                    <lineposition>1</lineposition>
                </FB_FlowField>
                <FB_FlowField>
                    <value>TEST2</value>
                    <counter>222</counter>
                    <lineposition>2</lineposition>
                    <groupid>0</groupid>
                </FB_FlowField>
                <FB_FlowField>
                    <value>TEST3</value>
                    <counter>333</counter>
                    <lineposition>3</lineposition>
                </FB_FlowField>
            </fields>
        </FB_FlowGroup>
        <FB_FlowGroup counter="1126" position="3" positionparent="2" id="0">
            <fields>
                <FB_FlowField>
                    <value>TEST1</value>
                    <counter>18</counter>
                    <lineposition>1</lineposition>
                </FB_FlowField>
            </fields>
        </FB_FlowGroup>     
    </groups>
</FB_Flow> 

The first part works fine (To get a list of all groups)

   insert into @Groups (intGroupCounter,intGroupPosition,intGroupPositionParent)
   SELECT
        gcounter = Groups.value('@counter[1]', 'int'),
        gposition = Groups.value('@position[1]', 'int'),
        gpositionparent = Groups.value('@positionparent[1]', 'int')
FROM
        @FlowXML.nodes('/FB_Flow/groups/FB_FlowGroup') AS XTbl(Groups)

This second part fails for the most part (To get all fields with the parent group position):

   insert into @Fields (intGroupPosition,vFieldValue,intFieldCounter,intFieldPosition)
       SELECT
        gposition = XTbl.Groups.value('@position', 'int'),
        fValue = XTbl2.Fields.value('value[1]', 'varchar(max)'),
        fcounter = XTbl2.Fields.value('counter[1]', 'int'),
        fposition = XTbl2.Fields.value('lineposition[1]', 'int')
FROM
        @FlowXML.nodes('/FB_Flow/groups/FB_FlowGroup') AS XTbl(Groups)
cross APPLY
    Groups.nodes('fields/FB_FlowField') AS XTbl2(Fields)

I have been getting around this by using a cursor and selecting the group by the position attribute but the performance is very poor.

   DECLARE @GroupCounter int,
            @GroupPosition int,
            @GroupPositionParent int,
            @GroupID int
            
   DECLARE @Groups table
   (
        intGroupCounter int not null,
        intGroupPosition int not null,
        intGroupPositionParent int null default 0
   )
            
   insert into @Groups (intGroupCounter,intGroupPosition,intGroupPositionParent)
   SELECT
        gcounter = Groups.value('@counter[1]', 'int'),
        gposition = Groups.value('@position[1]', 'int'),
        gpositionparent = Groups.value('@positionparent[1]', 'int')
FROM
        @FlowXML.nodes('/FB_Flow/groups/FB_FlowGroup') AS XTbl(Groups)

   DECLARE cur cursor for 
   SELECT
        intGroupCounter,
        intGroupPosition,
        intGroupPositionParent
    FROM
        @Groups

    OPEN cur

    FETCH NEXT FROM cur INTO @GroupCounter, @GroupPosition, @GroupPositionParent

    WHILE @@FETCH_STATUS = 0
    BEGIN
        insert into FB_T_FlowGroups (FH_ID,DTC_GroupCounter,Position,PositionParent)
        values (@FlowHeaderID,@GroupCounter,@GroupPosition,@GroupPositionParent)

        select @GroupID = @@IDENTITY 

        --declare @Path varchar(max) = '/FB_Flow/groups/FB_FlowGroup[@position="sql:variable("@GroupPosition")"]/fields/FB_FlowField' 
        
        insert into FB_T_FlowGroupField (FlowGroupID,ItemValue,DTC_ItemCounter)
        SELECT
            @GroupID,
            XTbl.Fields.value('value[1]', 'varchar(max)'),
            XTbl.Fields.value('counter[1]', 'int')
        FROM
            @FlowXML.nodes('/FB_Flow/groups/FB_FlowGroup[@position=sql:variable("@GroupPosition")]/fields/FB_FlowField') AS XTbl(Fields)
        

        FETCH NEXT FROM cur INTO @GroupCounter, @GroupPosition, @GroupPositionParent
    END
    CLOSE cur
    DEALLOCATE cur

Any Ideas?

Upvotes: 0

Views: 83

Answers (1)

Yitzhak Khabinsky
Yitzhak Khabinsky

Reputation: 22182

What is your SQL Server version (SELECT @@VERSION;)?

Please try the following approach without a cursor. It should give you a tremendous performance improvement:

  • XML attributes don't need [1] position. Attributes are always unique.
  • XML elements need an adjustment in the XPath expression - text().

SQL

DECLARE @FlowXML XML =
N'<FB_Flow xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:xsd="http://www.w3.org/2001/XMLSchema" id="1">
    <groups>
        <FB_FlowGroup counter="1125" position="2" positionparent="0" id="0">
            <fields>
                <FB_FlowField>
                    <value>TEST1</value>
                    <counter>111</counter>
                    <lineposition>1</lineposition>
                </FB_FlowField>
                <FB_FlowField>
                    <value>TEST2</value>
                    <counter>222</counter>
                    <lineposition>2</lineposition>
                    <groupid>0</groupid>
                </FB_FlowField>
                <FB_FlowField>
                    <value>TEST3</value>
                    <counter>333</counter>
                    <lineposition>3</lineposition>
                </FB_FlowField>
            </fields>
        </FB_FlowGroup>
        <FB_FlowGroup counter="1126" position="3" positionparent="2" id="0">
            <fields>
                <FB_FlowField>
                    <value>TEST1</value>
                    <counter>18</counter>
                    <lineposition>1</lineposition>
                </FB_FlowField>
            </fields>
        </FB_FlowGroup>
    </groups>
</FB_Flow>';

-- insert into @Groups (intGroupCounter,intGroupPosition,intGroupPositionParent)
SELECT gcounter = Groups.value('@counter', 'INT')
    , gposition = Groups.value('@position', 'INT')
    , gpositionparent = Groups.value('@positionparent', 'INT')
FROM @FlowXML.nodes('/FB_Flow/groups/FB_FlowGroup') AS XTbl(Groups);

--insert into @Fields (intGroupPosition,vFieldValue,intFieldCounter,intFieldPosition)
SELECT gposition = XTbl.Groups.value('@position', 'INT')
    , fValue = XTbl2.Fields.value('(value/text())[1]', 'VARCHAR(MAX)')
    , fcounter = XTbl2.Fields.value('(counter/text())[1]', 'INT')
    , fposition = XTbl2.Fields.value('(lineposition/text())[1]', 'INT')
FROM @FlowXML.nodes('/FB_Flow/groups/FB_FlowGroup') AS XTbl(Groups)
    CROSS APPLY Groups.nodes('fields/FB_FlowField') AS XTbl2(Fields);

Upvotes: 1

Related Questions