zening.chen
zening.chen

Reputation: 127

Get the name of tag and attribute in SQL Server using OPENXML

I have to process an XML file with following format:

<Root>
<A name="x1">
    <B exp="h1" ref="r1"/>
    <C exp="h2" ref="r2" rat = "ra1"/>
    <D exp="h3" ref="r3"/>
</A>
<A name="x2">
    <E exp="h4" ref="r4"/>
    <F exp="h5" ref="r5"/>
</A>
</Root>

I want to write an stored procedure to get a table like:

|A_name|tag_name|attrbute|val|
|x1    |   B    |exp|h1|
|x1    |   B    |ref|r1|
|x1    |   C    |exp|h2|
|x1    |   C    |rat|ra1|
|x1    |   C    |ref|r2|
|x1    |   D    |exp|h3|
|x1    |   D    |ref|r3|
|x2    |   E    |exp|h4|
|x2    |   E    |ref|r4|
|x2    |   F    |exp|h5|
|x2    |   F    |ref|r5|

How can I do? I've already try

SELECT localname
FROM OPENXML(@idoc, '/Root/A') 
WHERE localname!='A'

to get the tag name. However, I am not able to join them with attribute of tag A.

I found out that the performance of using nodes() combined with value() is much worse than OPENXML. So even OPENXML is outdated, I still want to use it since I have GBs XML result to read.

Upvotes: 0

Views: 389

Answers (1)

AlwaysLearning
AlwaysLearning

Reputation: 8829

I'd be surprised to see OPENXML outperform well-written nodes() and value() code when processing large amounts of data. OPENXML is an extremely RBAR process requiring you to call two stored procedures and a UDF for each XML document to be processed. If you have poorly performing code using nodes() and value() it probably has back-tracking references (i.e.: '../') and other such performance killers.

Nevertheless, here's some OPENXML craziness that will produce the result set you're looking for...

declare @idoc int, @doc nvarchar(max) = N'<Root>
<A name="x1">
    <B exp="h1" ref="r1"/>
    <C exp="h2" ref="r2" rat="ra1"/>
    <D exp="h3" ref="r3"/>
</A>
<A name="x2">
    <E exp="h4" ref="r4"/>
    <F exp="h5" ref="r5"/>
</A>
</Root>';

exec sp_xml_preparedocument @idoc output, @doc; 

;with rowset as (
  select id, parentid, nodetype, localname, prefix, namespaceuri, datatype, prev, [text]
  from openxml(@idoc, '/Root/A')
)
select A_name, tag_name, attribute, val
from rowset Router
outer apply (
  select A_name_id=Rinner.id
  from rowset Rinner
  where Rinner.nodetype=2 and Rinner.parentid=Router.id and Rinner.localname=N'name'
) NameAttributes
outer apply (
  select A_name=Rinner.[text]
  from rowset Rinner
  where Rinner.nodetype=3 and Rinner.parentid=NameAttributes.A_name_id and Rinner.localname=N'#text'
) NameValues
cross apply (
  select tag_id=Rinner.id, tag_name=Rinner.localname
  from rowset Rinner
  where Rinner.nodetype=1 and Rinner.parentid=Router.id
) Children
cross apply (
  select attribute_id=Rinner.id, attribute=Rinner.localname
  from rowset Rinner
  where Rinner.nodetype=2 and Rinner.parentid=Children.tag_id
) ChildAttributes
outer apply (
  select val=Rinner.[text]
  from rowset Rinner
  where Rinner.nodetype=3 and Rinner.parentid=ChildAttributes.attribute_id
) ChildAttributeValues
where nodetype=1 and parentid=0
order by Router.id, Children.tag_id, ChildAttributes.attribute

exec sp_xml_removedocument @idoc;
A_name  tag_name    attribute   val
x1      B           exp         h1
x1      B           ref         r1
x1      C           exp         h2
x1      C           rat         ra1
x1      C           ref         r2
x1      D           exp         h3
x1      D           ref         r3
x2      E           exp         h4
x2      E           ref         r4
x2      F           exp         h5
x2      F           ref         r5

Upvotes: 1

Related Questions