Reputation: 127
I have to process an XML file with following format:
<Root>
<A name="x1">
<B exp="h1" ref="r1"/>
<C exp="h2" ref="r2" rat = "ra1"/>
<D exp="h3" ref="r3"/>
</A>
<A name="x2">
<E exp="h4" ref="r4"/>
<F exp="h5" ref="r5"/>
</A>
</Root>
I want to write an stored procedure to get a table like:
|A_name|tag_name|attrbute|val|
|x1 | B |exp|h1|
|x1 | B |ref|r1|
|x1 | C |exp|h2|
|x1 | C |rat|ra1|
|x1 | C |ref|r2|
|x1 | D |exp|h3|
|x1 | D |ref|r3|
|x2 | E |exp|h4|
|x2 | E |ref|r4|
|x2 | F |exp|h5|
|x2 | F |ref|r5|
How can I do? I've already try
SELECT localname
FROM OPENXML(@idoc, '/Root/A')
WHERE localname!='A'
to get the tag name. However, I am not able to join them with attribute of tag A.
I found out that the performance of using nodes() combined with value() is much worse than OPENXML. So even OPENXML is outdated, I still want to use it since I have GBs XML result to read.
Upvotes: 0
Views: 389
Reputation: 8829
I'd be surprised to see OPENXML
outperform well-written nodes()
and value()
code when processing large amounts of data. OPENXML
is an extremely RBAR process requiring you to call two stored procedures and a UDF for each XML document to be processed. If you have poorly performing code using nodes()
and value()
it probably has back-tracking references (i.e.: '../'
) and other such performance killers.
Nevertheless, here's some OPENXML
craziness that will produce the result set you're looking for...
declare @idoc int, @doc nvarchar(max) = N'<Root>
<A name="x1">
<B exp="h1" ref="r1"/>
<C exp="h2" ref="r2" rat="ra1"/>
<D exp="h3" ref="r3"/>
</A>
<A name="x2">
<E exp="h4" ref="r4"/>
<F exp="h5" ref="r5"/>
</A>
</Root>';
exec sp_xml_preparedocument @idoc output, @doc;
;with rowset as (
select id, parentid, nodetype, localname, prefix, namespaceuri, datatype, prev, [text]
from openxml(@idoc, '/Root/A')
)
select A_name, tag_name, attribute, val
from rowset Router
outer apply (
select A_name_id=Rinner.id
from rowset Rinner
where Rinner.nodetype=2 and Rinner.parentid=Router.id and Rinner.localname=N'name'
) NameAttributes
outer apply (
select A_name=Rinner.[text]
from rowset Rinner
where Rinner.nodetype=3 and Rinner.parentid=NameAttributes.A_name_id and Rinner.localname=N'#text'
) NameValues
cross apply (
select tag_id=Rinner.id, tag_name=Rinner.localname
from rowset Rinner
where Rinner.nodetype=1 and Rinner.parentid=Router.id
) Children
cross apply (
select attribute_id=Rinner.id, attribute=Rinner.localname
from rowset Rinner
where Rinner.nodetype=2 and Rinner.parentid=Children.tag_id
) ChildAttributes
outer apply (
select val=Rinner.[text]
from rowset Rinner
where Rinner.nodetype=3 and Rinner.parentid=ChildAttributes.attribute_id
) ChildAttributeValues
where nodetype=1 and parentid=0
order by Router.id, Children.tag_id, ChildAttributes.attribute
exec sp_xml_removedocument @idoc;
A_name tag_name attribute val
x1 B exp h1
x1 B ref r1
x1 C exp h2
x1 C rat ra1
x1 C ref r2
x1 D exp h3
x1 D ref r3
x2 E exp h4
x2 E ref r4
x2 F exp h5
x2 F ref r5
Upvotes: 1