Jorge E. Hernández
Jorge E. Hernández

Reputation: 2938

How to retrieve rows with XML column that have specific child node?

I have a table in SQL Server with an XML column:

CREATE TABLE dbo.XmlData (
    id INT IDENTITY,
    data XML
);

The XML stored in the data column is dynamic, I mean, only some of them have certain child node. Additionally, this nodes appear in different locations within the XML structure.

Example 1, /RootNode/FindMe:

<RootNode>
    <ChildNodeOne>One</ChildNodeOne>
    <ChildNodeTwo>Two</ChildNodeTwo>
    <ChildNodeThree>Three</ChildNodeThree>
    <FindMe>FindMe</FindMe>
</RootNode>

Example 2, /RootNode/ChildNodeThree/Deeper/FindMe and /RootNode/ChildNodeTwo/FindMe :

<RootNode>
    <ChildNodeOne>One</ChildNodeOne>
    <ChildNodeTwo>
        <FindMe>FindMe</FindMe>
    </ChildNodeTwo>
    <ChildNodeThree>
        <Deeper>
            <FindMe>FindMe</FindMe>
        </Deeper>
    </ChildNodeThree>
</RootNode>

There are rows with the XML column that doesn't have the <FindMe/> node.

I need to write a query to retrieve only the rows that have the <FindMe/> node. No matter its location within the XML structure.

Upvotes: 0

Views: 241

Answers (1)

Gottfried Lesigang
Gottfried Lesigang

Reputation: 67291

The first thing to know is, that //FindMe will trigger a deep search. That means: Find this node whereever it exists in the XML. A single / marks the root element (Start the search at the beginning) and no slash at all marks the current node (Continue from the context node):

In the following there are some examples how one could query for a node name:

A dummy table for tests:

CREATE TABLE dbo.DummyTbl (
    id INT IDENTITY,
    Remark VARCHAR(100),
    data XML
);

INSERT INTO DummyTbl VALUES
('FindMe is on second level',
'<RootNode>
    <ChildNodeOne>One</ChildNodeOne>
    <ChildNodeTwo>Two</ChildNodeTwo>
    <ChildNodeThree>Three</ChildNodeThree>
    <FindMe>FindMe 1</FindMe>
</RootNode>')
,('FindMe is two times somewhere deeper',
'<RootNode>
    <ChildNodeOne>One</ChildNodeOne>
    <ChildNodeTwo>
        <FindMe>FindMe 2a</FindMe>
    </ChildNodeTwo>
    <ChildNodeThree>
        <Deeper>
            <FindMe>FindMe 2b</FindMe>
        </Deeper>
    </ChildNodeThree>
</RootNode>')
,('FindMe does not exist',
'<RootNode>
    <ChildNodeOne>One</ChildNodeOne>
    <ChildNodeTwo>
    </ChildNodeTwo>
    <ChildNodeThree>
        <Deeper>
            <FindMeNot>Something else</FindMeNot>
        </Deeper>
    </ChildNodeThree>
</RootNode>')
,('FindMe exists, but is empty',
'<RootNode>
    <ChildNodeOne>One</ChildNodeOne>
    <ChildNodeTwo>
    </ChildNodeTwo>
    <ChildNodeThree>
        <Deeper>
            <FindMe/>
        </Deeper>
    </ChildNodeThree>
</RootNode>');

Different queries:

--All FindMe nodes (inlcuding the empty one)

SELECT d.id,d.Remark
      ,f.value('.','nvarchar(max)') FindMeNode
FROM dbo.DummyTbl AS d
CROSS APPLY d.data.nodes('//FindMe') AS A(f)

--All IDs where there is at least one "FindMe"-node

SELECT d.id,d.Remark
FROM dbo.DummyTbl AS d
WHERE d.data.exist('//FindMe')=1
GO

--All IDs where there is at least one "FindMe"-node, but not the empty one

SELECT d.id,d.Remark
FROM dbo.DummyTbl AS d
WHERE d.data.exist('//FindMe/text()')=1

--Find IDs, where there is at least one empty FindMe node

SELECT d.id,d.Remark
FROM dbo.DummyTbl AS d
WHERE d.data.exist('//FindMe[empty(text())]')=1

--Now with a variable name to search for

DECLARE @SearchForName NVARCHAR(100)='FindMe'; 
SELECT d.id,d.Remark
FROM dbo.DummyTbl AS d
WHERE d.data.exist('//*[local-name()=sql:variable("@SearchForName")]/text()')=1
GO

--Clean up

DROP TABLE dbo.DummyTbl;

Upvotes: 1

Related Questions