Sebastian Kramer
Sebastian Kramer

Reputation: 15

Assign ID's to XML hierarchy using SQL/XQuery

I would like to extract ID's from a XML document using the hierarchy given by the XML. The document has no ID's in it, they have to be created using the structure of the document.

This is the XML document:

<GL_MarketDocument xmlns="urn:iec62325.351:tc57wg16:451-6:generationloaddocument:3:0">
    <TimeSeries>
        <MktPSRType>
            <psrType>B11</psrType>
        </MktPSRType>
        <MktPSRType>
            <psrType>B12</psrType>
        </MktPSRType>
    </TimeSeries>
    <TimeSeries>
        <MktPSRType>
            <psrType>B11</psrType>
        </MktPSRType>
    </TimeSeries>
</GL_MarketDocument>

I use the follwing code to create ID's which represent the hierarchy of the XML. My idea was to use a row numbering, but I don't get it working.

DECLARE @XMLDocument XML =
N'<GL_MarketDocument xmlns="urn:iec62325.351:tc57wg16:451-6:generationloaddocument:3:0">
    <TimeSeries>
        <MktPSRType>
            <psrType>B11</psrType>
        </MktPSRType>
        <MktPSRType>
            <psrType>B12</psrType>
        </MktPSRType>
    </TimeSeries>
    <TimeSeries>
        <MktPSRType>
            <psrType>B11</psrType>
        </MktPSRType>
    </TimeSeries>
</GL_MarketDocument>';

WITH XMLNAMESPACES (DEFAULT 'urn:iec62325.351:tc57wg16:451-6:generationloaddocument:3:0')
SELECT  row_number() over (partition by T.N order by T.N) as TimeSeriesID,
        row_number() over(partition by T.N order by T1.N) as MktpsrtID,
        T2.N.value('(*:psrType/text())[1]', 'nvarchar(5)') as [psrType]
FROM @XMLDocument.nodes('/GL_MarketDocument') AS T(N)
outer apply T.N.nodes('*:TimeSeries') as T1(N)
outer apply T1.N.nodes('*:MktPSRType') as T2(N)

The result is:

TimeSeriesID MktpsrtID psrType
1 1 B11
2 2 B12
3 3 B11

What I would like to see is the following numbering, representing the node structure of the XML.

TimeSeriesID MktpsrtID psrType
1 1 B11
1 2 B12
2 1 B11

The use of row_number() is less important, I am searching for the most efficient way to solve the problem.

Upvotes: 0

Views: 50

Answers (1)

Yitzhak Khabinsky
Yitzhak Khabinsky

Reputation: 22187

Please try the following solution.

When there is a default namespace defined - no need to use namespace wildcard (*).

The TimeSeriesID column value is calculated by using XQuery FLWOR expression and XQuery Node Comparison operator.

SQL

DECLARE @XMLDocument XML =
N'<GL_MarketDocument xmlns="urn:iec62325.351:tc57wg16:451-6:generationloaddocument:3:0">
    <TimeSeries>
        <MktPSRType>
            <psrType>B11</psrType>
        </MktPSRType>
        <MktPSRType>
            <psrType>B12</psrType>
        </MktPSRType>
    </TimeSeries>
    <TimeSeries>
        <MktPSRType>
            <psrType>B21</psrType>
        </MktPSRType>
    </TimeSeries>
</GL_MarketDocument>';

WITH XMLNAMESPACES (DEFAULT 'urn:iec62325.351:tc57wg16:451-6:generationloaddocument:3:0')
SELECT c1.value('for $i in . return count(/GL_MarketDocument/TimeSeries[. << $i]) + 1', 'INT') AS TimeSeriesID
    , DENSE_RANK() OVER(PARTITION BY c1 ORDER BY c2) as MktpsrtID
    , c2.value('(psrType/text())[1]', 'nvarchar(5)') as [psrType]
FROM @XMLDocument.nodes('/GL_MarketDocument/TimeSeries') AS t1(c1)
    OUTER APPLY t1.c1.nodes('MktPSRType') as t2(c2);

Output

+--------------+-----------+---------+
| TimeSeriesID | MktpsrtID | psrType |
+--------------+-----------+---------+
|            1 |         1 | B11     |
|            1 |         2 | B12     |
|            2 |         1 | B21     |
+--------------+-----------+---------+

Upvotes: 2

Related Questions