rajsx
rajsx

Reputation: 61

How to access the next sibling of an xml tag in openxml sql

So i have an XML file which looks like below

declare @xml
xml= '<ENVELOPE>
    <BILLFIXED>
        <BILLDATE>29-Jun-2019</BILLDATE>
        <BILLREF>123</BILLREF>
        <BILLPARTY>ABC</BILLPARTY>
    </BILLFIXED>
    <BILLOP>200</BILLOP>
    <BILLCL>200</BILLCL>
    <BILLDUE>29-Jun-2019</BILLDUE>
    <BILLOVERDUE>1116</BILLOVERDUE>
    <BILLFIXED>
        <BILLDATE>30-Jun-2019</BILLDATE>
        <BILLREF>April To June -19</BILLREF>
        <BILLPARTY>efg</BILLPARTY>
    </BILLFIXED>
    <BILLOP>100</BILLOP>
    <BILLCL>100</BILLCL>
    <BILLDUE>30-Jun-2019</BILLDUE>
    <BILLOVERDUE>1115</BILLOVERDUE>
</ENVELOPE>

Im trying to read this using openxml

DECLARE  @hDoc AS INT, @SQL NVARCHAR (MAX) 

EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML 

select   BILLDATE, BILLREF, BILLPARTY, BILLOP,BILLCL,BILLDUE, BILLOVERDUE
from OPENXML(@hDoc, '//BILLFIXED') 
WITH  
( 
BillDate [varchar](50) 'BILLDATE', 
BIllREF [varchar](50) 'BILLREF', 
BILLPARTY [varchar](100) 'BILLPARTY' 
,BILLOP [varchar](100) 'BILLOP' 
 BILLCL[varchar](100) 'REFERENCE', 
 BILLDUE [varchar](100) 'BILLDUE', 
 BILLOVERDUE [varchar](100) 'BILLOVERDUE'
) 

It was easy to extract <BILLFIXED> tag but not able to access the siblings tags

<BILLCL>
<BILLDUE>
<BILLOVERDUE>

any help in accessing these tags Thanks

Upvotes: 1

Views: 244

Answers (3)

Charlieface
Charlieface

Reputation: 71927

I agree you should not use OPENXML, and instead use .nodes and .value.

Unfortunately, SQL Server does not allow the sibling:: axis in XQuery, which would have made this much easier.

You can do this purely using XQuery, by using the >> positional predicate.

SELECT
      c.value('(BILLDATE /text())[1]', 'VARCHAR(20)') AS BILLDATE
    , c.value('(BILLREF  /text())[1]', 'VARCHAR(20)') AS BILLREF
    , c.value('(BILLPARTY/text())[1]', 'VARCHAR(20)') AS BILLPARTY
    , c.value('let $i := . return (/ENVELOPE/BILLOP     [. >> $i]/text())[1]', 'INT') AS BILLOP
    , c.value('let $i := . return (/ENVELOPE/BILLCL     [. >> $i]/text())[1]', 'INT') AS BILLCL
    , c.value('let $i := . return (/ENVELOPE/BILLDUE    [. >> $i]/text())[1]', 'VARCHAR(20)') AS BILLDUE
    , c.value('let $i := . return (/ENVELOPE/BILLOVERDUE[. >> $i]/text())[1]', 'INT') AS BILLOVERDUE
FROM @xml.nodes('/ENVELOPE/BILLFIXED') AS t(c);

What this does is as follows:

  • Shred just the /ENVELOPE/BILLFIXED nodes.
  • For each of those, return the children as normal.
  • For each sibling, do the following steps:
    • Store the current node in $i.
    • Take the first child node matching the correct name, of the parent,...
    • ...where that node is positioned after $i, so [. >> $i]
    • Take the first child node's text [1]

db<>fiddle

If some of the nodes could be empty or missing, then you also need to check that the sibling node is before the next BILLFIXED node

SELECT
      c.value('(BILLDATE /text())[1]', 'VARCHAR(20)') AS BILLDATE
    , c.value('(BILLREF  /text())[1]', 'VARCHAR(20)') AS BILLREF
    , c.value('(BILLPARTY/text())[1]', 'VARCHAR(20)') AS BILLPARTY
    , c.value('let $i := . let $nxt := (/ENVELOPE/BILLFIXED[. >> $i])[1] return (/ENVELOPE/BILLOP     [. >> $i][. << $nxt]/text())[1]', 'INT') AS BILLOP
    , c.value('let $i := . let $nxt := (/ENVELOPE/BILLFIXED[. >> $i])[1] return (/ENVELOPE/BILLCL     [. >> $i][. << $nxt]/text())[1]', 'INT') AS BILLCL
    , c.value('let $i := . let $nxt := (/ENVELOPE/BILLFIXED[. >> $i])[1] return (/ENVELOPE/BILLDUE    [. >> $i][. << $nxt]/text())[1]', 'VARCHAR(20)') AS BILLDUE
    , c.value('let $i := . let $nxt := (/ENVELOPE/BILLFIXED[. >> $i])[1] return (/ENVELOPE/BILLOVERDUE[. >> $i][. << $nxt]/text())[1]', 'INT') AS BILLOVERDUE
FROM @xml.nodes('/ENVELOPE/BILLFIXED') AS t(c);

db<>fiddle


One final option is to transform the XML into a more normalized structure, and then query that

SELECT
      f.value('(BILLDATE   /text())[1]', 'VARCHAR(20)') AS BILLDATE
    , f.value('(BILLREF    /text())[1]', 'VARCHAR(20)') AS BILLREF
    , f.value('(BILLPARTY  /text())[1]', 'VARCHAR(20)') AS BILLPARTY
    , c.value('(BILLOP     /text())[1]', 'INT') AS BILLOP
    , c.value('(BILLCL     /text())[1]', 'INT') AS BILLCL
    , c.value('(BILLDUE    /text())[1]', 'VARCHAR(20)') AS BILLDUE
    , c.value('(BILLOVERDUE/text())[1]', 'INT') AS BILLOVERDUE
FROM (VALUES(
  @xml.query('
    for $bf in /ENVELOPE/BILLFIXED
    let $nxt := (/ENVELOPE/BILLFIXED[. >> $bf])[1]
    return
      <ENVELOPE>
        {$bf}
        {
            if ($nxt) then
                /ENVELOPE/*[. >> $bf][. << $nxt]
            else
                /ENVELOPE/*[. >> $bf]
        }
      </ENVELOPE>
  ')
) ) v(transformed)
CROSS APPLY v.transformed.nodes('/ENVELOPE') AS t(c)
CROSS APPLY t.c.nodes('BILLFIXED') t2(f);

db<>fiddle

Steps are as follows:

  • Use .query to create a new XML document.
  • For each BILLFIXED node, store that in $bf
  • Store the BILLFIXED node which follows $bf in $nxt
  • Return an ENVELOPE node, containing $bf as well as all nodes which...
    • ... are after $bf
    • ... and before $nxt if there is a $nxt
  • Query that result as normal, using .nodes

Upvotes: 2

lptr
lptr

Reputation: 6798

Pivot the xml elements grouped by the number of preceding BILLFIXED nodes, fiddle.

declare @xml
xml= '<ENVELOPE>'+
replicate( cast('
    <BILLFIXED>
        <BILLDATE>29-Jun-2019</BILLDATE>
        <BILLREF>123</BILLREF>
        <BILLPARTY>ABC</BILLPARTY>
    </BILLFIXED>
    <BILLOP>200</BILLOP>
    <BILLCL>200</BILLCL>
    <BILLDUE>29-Jun-2019</BILLDUE>
    <BILLOVERDUE>1116</BILLOVERDUE>
    <BILLFIXED>
        <BILLDATE>30-Jun-2019</BILLDATE>
        <BILLREF>April To June -19</BILLREF>
        <BILLPARTY>efg</BILLPARTY>
    </BILLFIXED>
    <BILLOP>100</BILLOP>
    <BILLCL>100</BILLCL>
    <BILLDUE>30-Jun-2019</BILLDUE>
    <BILLOVERDUE>1115</BILLOVERDUE>' as nvarchar(max)), 100) +'
</ENVELOPE>';


select 
  max(case when col = 'BILLDATE' then val end) as BILLDATE,
  max(case when col = 'BILLREF' then val end) as BILLREF,
  max(case when col = 'BILLPARTY' then val end) as BILLPARTY,
  max(case when col = 'BILLOP' then val end) as BILLOP,
  max(case when col = 'BILLCL' then val end) as BILLCL,
  max(case when col = 'BILLDUE' then val end) as BILLPARTY,
  max(case when col = 'BILLOVERDUE' then val end) as BILLOVERDUE
from
(
select 
  b.b.value('local-name(.)', 'nvarchar(200)') as col, b.b.value('text()[1]', 'nvarchar(200)') as val,
  sum(case when b.b.value('local-name(.)', 'nvarchar(200)') = 'BILLFIXED' then 1 else 0 end) 
  over(order by b.b rows unbounded preceding) as grp
from @xml.nodes('/ENVELOPE//*') as b(b)
) as x
group by grp;

Upvotes: 1

Yitzhak Khabinsky
Yitzhak Khabinsky

Reputation: 22212

Microsoft proprietary OPENXML() and its companions sp_xml_preparedocument and sp_xml_removedocument are kept just for backward compatibility with the obsolete SQL Server 2000. Their use is diminished just to very few fringe cases. Starting from SQL Server 2005 onwards, it is strongly recommended to re-write your SQL and switch it to XQuery. Also, OPENXML() cannot take advantage of XML indexes while XQuery methods can.

Please try the following solution.

The XML sample is flattened.

So, we are using full power of XQuery.

SQL

DECLARE @xml XML = 
N'<ENVELOPE>
    <BILLFIXED>
        <BILLDATE>29-Jun-2019</BILLDATE>
        <BILLREF>123</BILLREF>
        <BILLPARTY>ABC</BILLPARTY>
    </BILLFIXED>
    <BILLOP>200</BILLOP>
    <BILLCL>200</BILLCL>
    <BILLDUE>29-Jun-2019</BILLDUE>
    <BILLOVERDUE>1116</BILLOVERDUE>
    <BILLFIXED>
        <BILLDATE>30-Jun-2019</BILLDATE>
        <BILLREF>April To June -19</BILLREF>
        <BILLPARTY>efg</BILLPARTY>
    </BILLFIXED>
    <BILLOP>100</BILLOP>
    <BILLCL>100</BILLCL>
    <BILLDUE>30-Jun-2019</BILLDUE>
    <BILLOVERDUE>1115</BILLOVERDUE>
</ENVELOPE>';

SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Nmbr 
    , c.value('(./BILLDATE/text())[1]', 'VARCHAR(20)') AS BILLDATE
    , c.value('(./BILLREF/text())[1]', 'VARCHAR(20)') AS BILLREF
    , c.value('(./BILLPARTY/text())[1]', 'VARCHAR(20)') AS BILLPARTY
    , c.value('(/ENVELOPE/*[sql:column("seq.pos")]/text())[1]', 'INT') AS BILLOP
    , c.value('(/ENVELOPE/*[sql:column("seq.pos") + 1]/text())[1]', 'INT') AS BILLCL
    , c.value('(/ENVELOPE/*[sql:column("seq.pos") + 2]/text())[1]', 'VARCHAR(20)') AS BILLDUE
    , c.value('(/ENVELOPE/*[sql:column("seq.pos") + 3]/text())[1]', 'INT') AS BILLOVERDUE
    , seq.pos   -- just to see
FROM @xml.nodes('/ENVELOPE/BILLFIXED') AS t(c)
   CROSS APPLY (SELECT t.c.value('let $n := . return count(/ENVELOPE/*[. << $n[1]]) + 2','INT') AS pos
         ) AS seq;

Output

Nmbr BILLDATE BILLREF BILLPARTY BILLOP BILLCL BILLDUE BILLOVERDUE pos
1 29-Jun-2019 123 ABC 200 200 29-Jun-2019 1116 2
2 30-Jun-2019 April To June -19 efg 100 100 30-Jun-2019 1115 7

Upvotes: 1

Related Questions