Reputation: 1350
I need to expand an xml in SQL server as table. I do that using XQuery
with nodes()
and .query()
. But I need that each node has a sequential number and I also need to filter nodes based on their internal structure and I don't figure out how to do that.
I need following result:
------------------------
| 1 | <xml node ...> |
------------------------
| 2 | <xml node ...> |
------------------------
...
I have following XML:
<node xmlns="http://mynamespace.com/ns/">
<subnode xmlns="http://mynamespace.com/ns/">
<Tag>Domain</Tag>
<Value>dom</Value>
</subnode>
<subnode xmlns="http://mynamespace.com/ns/">
<Tag>Wdth</Tag>
<Value>1</Value>
</subnode>
<subnode xmlns="http://mynamespace.com/ns/">
<Tag>Code</Tag>
<Value>TEST</Value>
</subnode>
</node>
<node xmlns="http://mynamespace.com/ns/">
<subnode xmlns="http://mynamespace.com/ns/">
<Tag>Domain</Tag>
<Value>dom</Value>
</subnode>
<subnode xmlns="http://mynamespace.com/ns/">
<Tag>Wdth</Tag>
<Value>1</Value>
</subnode>
<subnode xmlns="http://mynamespace.com/ns/">
<Tag>Code</Tag>
<Value></Value>
</subnode>
</node>
I need when <Tag>Code</Tag>
has value empty (<Value></Value>
) to skip this node.
I use the following code, but I can not figure out how to make sequential numbers or how to filter:
DECLARE @XMLInput XML = '<node xmlns="http://mynamespace.com/ns/">
<subnode xmlns="http://mynamespace.com/ns/">
<Tag>Domain</Tag>
<Value>dom</Value>
</subnode>
<subnode xmlns="http://mynamespace.com/ns/">
<Tag>Wdth</Tag>
<Value>1</Value>
</subnode>
<subnode xmlns="http://mynamespace.com/ns/">
<Tag>Code</Tag>
<Value>TEST</Value>
</subnode>
</node>
<node xmlns="http://mynamespace.com/ns/">
<subnode xmlns="http://mynamespace.com/ns/">
<Tag>Domain</Tag>
<Value>dom</Value>
</subnode>
<subnode xmlns="http://mynamespace.com/ns/">
<Tag>Wdth</Tag>
<Value>1</Value>
</subnode>
<subnode xmlns="http://mynamespace.com/ns/">
<Tag>Code</Tag>
<Value></Value>
</subnode>
</node>';
SELECT
Child.query('declare default element namespace "http://mynamespace.com/ns/"; (.)') AS node
FROM
@XMLInput.nodes('declare default element namespace "http://mynamespace.com/ns/"; (/node)') AS N(Child)
EDIT:
Because there are unclear elements I clarify. I need to filter out entire <node>
node when there is a <subnode>
node with <Tag>
node with value "Code"
and corresponding <Value>
node which is empty. In this case I need whole <node>
to be removed - not visible.
Upvotes: 1
Views: 589
Reputation: 67311
The first part is rather easy, use ROW_NUMBER() OVER()
. As XMLs have an implicit sort order, we can use (SELECT NULL)
. Rows will appear according to their physical order within the XML:
With this code you will get your nodes numbered:
WITH XMLNAMESPACES(DEFAULT 'http://mynamespace.com/ns/')
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS NodeNr
,Child.query('.') AS node
FROM
@XMLInput.nodes('/node') AS N(Child)
Your second part is not clear for me. Do you want to surpress the second node completely because there is
<subnode xmlns="http://mynamespace.com/ns/">
<Tag>Code</Tag>
<Value></Value>
</subnode>
? Or do you want to surpress just this <subnode>
?
The following would use the upper as derived table (a CTE
) and then use .exist()
on the column node
. This method merely checks the existance of any node according to the XQuery
expression. In this case I search for any <Tag>
with the text()="Code"
. From there we navigate one level up and search for a <Value>
element where the text()
is empty. If this exists, the function returns 1
, so we need those without:
WITH XMLNAMESPACES(DEFAULT 'http://mynamespace.com/ns/')
,shredded AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS NodeNr
,Child.query(N'.') AS node
FROM
@XMLInput.nodes(N'/node') AS N(Child)
)
SELECT *
FROM shredded
WHERE shredded.node.exist(N'//Tag[text()="Code"]/../Value[empty(text())]')=0
As @MartinSmith has pointed out, there was no proof for this
Rows will appear according to their physical order within the XML
In the meanwhile I've found this:
I found a paper on XML in SQL Server 2005. There you can find XML values are stored in an internal format ... in order to support the XML model characteristics, such as document order ... This is - at least - a hint, that the document's order is persisted. Reading XML with FROM OPENXML
will return a sorted table too (at least I didn't find any exceptions yet).
Further more I've found Understanding the XML data type. where you can find This internal representation includes information about the containment hierarchy, document order, and element and attribute values. Specifically, the InfoSet content of the XML data is preserved.
About InfoSet
I found this document, which states about [children]
: An ordered list of child information items, in document order.
Well, this is still no valid proof, that a SELECT
on .nodes()
will return the derived table in exactly the same order as within the XML under all circumstances. But - at least - it points to the fact, that the internal order is worth to be persisted.
My conclusio: The internal order is seen as inherent part of the XML document. That's why I'm pretty sure, that .nodes()
will return a derived table in the same order. Adding ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
should do nothing else than to add a running number to these rows.
Upvotes: 1