AbtPst
AbtPst

Reputation: 8008

SQL Server : How to use XPATH in select query

I have a table in SQL Server which has a column with XML data type. For example, one value looks like

<doc>

<q></q>

<p1>
    <p2 dd="ert" ji="pp">

        <p3>1</p3>
        <p3>2</p3>
        <p3>XYZ</p3>
        <p3>3</p3>

     </p2>

     <p2 dd="ert" ji="pp">

        <p3>4</p3>
        <p3>5</p3>
        <p3>ABC</p3>
        <p3>6</p3>

     </p2>

</p1>
<r></r>
<p1>
    <p2 dd="ert" ji="pp">

        <p3>7</p3>
        <p3>8</p3>
        <p3>ABC</p3>
        <p3>9</p3>

     </p2>

     <p2 dd="ert" ji="pp">

        <p3>10</p3>
        <p3>11</p3>
        <p3>XYZ</p3>
        <p3>12</p3>

     </p2>

</p1>
</doc>

now, i would want to execute the following xpath

./doc//p1/p2/p3[contains(text(),'ABC') or contains(text(),'XYZ')]/preceding-sibling::p3

so the result is

1 2 4 5 7 8 10 11

so i want to select all <p3> nodes that are siblings of the <p3> nodes with text ABC or XYZ. Also, these <p3> nodes must be within <p2> nodes that are themselves within <p1> nodes. As of now, i just select the column as it is and then use the xpath in python to get the nodes. Is there a way to get the required nodes using the SQL select query?

EDIT

Note that it would suffice even if i got

<p3>1</p3>
        <p3>2</p3>
<p3>4</p3>
        <p3>5</p3>
<p3>7</p3>
        <p3>8</p3>
<p3>10</p3>
        <p3>11</p3>

because then i can easily use an XML parser to get the text.

Upvotes: 3

Views: 6688

Answers (2)

har07
har07

Reputation: 89285

There are some limitations to be considered in the SQL Server's XPath/XQuery, i.e neither following-sibling nor preceding-sibling are supported. As an alternative, you can use >> or << operator to compare sibling elements position. See the demo below.

Input XML in an XML variable :

declare @xml XML = '<?xml version="1.0" encoding="UTF-8"?>
<doc> 
  <q/>  
  <p1> 
    <p2 dd="ert" ji="pp"> 
      <p3>1</p3>  
      <p3>2</p3>  
      <p3>XYZ</p3>  
      <p3>3</p3> 
    </p2>  
    <p2 dd="ert" ji="pp"> 
      <p3>4</p3>  
      <p3>5</p3>  
      <p3>ABC</p3>  
      <p3>6</p3> 
    </p2> 
  </p1>  
  <r/>  
  <p1> 
    <p2 dd="ert" ji="pp"> 
      <p3>7</p3>  
      <p3>8</p3>  
      <p3>ABC</p3>  
      <p3>9</p3> 
    </p2>  
    <p2 dd="ert" ji="pp"> 
      <p3>10</p3>  
      <p3>11</p3>  
      <p3>XYZ</p3>  
      <p3>12</p3> 
    </p2> 
  </p1> 
</doc>'

Query :

SELECT @xml.query('
/doc/p1/p2/p3[. << ../p3[contains(.,"ABC") or contains(.,"XYZ")][last()]]
')

output : (tested in SQL Server 2008R2)

<p3>1</p3>
<p3>2</p3>
<p3>4</p3>
<p3>5</p3>
<p3>7</p3>
<p3>8</p3>
<p3>10</p3>
<p3>11</p3>

Upvotes: 1

Mathias M&#252;ller
Mathias M&#252;ller

Reputation: 22617

Think from another perspective. Start the XPath expression with the p3 elements you are actually interested in. Use the following expression:

/doc/p1/p2/p3[following-sibling::p3 = 'ABC' or following-sibling::p3 = 'XYZ']

or even

/doc/p1/p2/p3[following-sibling::p3[. = 'ABC' or . = 'XYZ']]

which will retrieve (individual results separated by ---------):

<p3>1</p3>
-----------------------
<p3>2</p3>
-----------------------
<p3>4</p3>
-----------------------
<p3>5</p3>
-----------------------
<p3>7</p3>
-----------------------
<p3>8</p3>
-----------------------
<p3>10</p3>
-----------------------
<p3>11</p3>

Upvotes: 1

Related Questions