Evgeniy

Reputation: 2605

How to match ancestor-or-self for contains Xpath?

I am trying to match the ancestor-or-self nodes of any element that contains a certain text string.

The first step, matching the elements that contain the text, works: //*[contains(text(),"ABC")].

But I am struggling with the syntax for adding the ancestor axis. I tried //*ancestor-or-self::[contains(text(),"ABC")] and //*[contains(text(),"ABC")]/ancestor-or-self without success.

What is the correct syntax for this?

The markup and the string I want to match can look like this:

<p><strong>Vertreten durch:</strong><br>Max Mustermann</p>

So I search for the string Vertreten durch in order to select the parent element <p>...</p>.

Upvotes: 1

Views: 144

Answers (3)

Dimitre Novatchev

Reputation: 243619

The markup and the string I want to match can look like this:

<p><strong>Vertreten durch:</strong><br>Max Mustermann</p>

So I search for the string Vertreten durch in order to select the parent element <p>...</p>.

This XPath expression:

//*[not(text()[2]) and contains(text()[1], 'Vertreten durch')]//ancestor-or-self::p[1]

when evaluated, selects the closest <p> ancestor-or-self of the element (in this case <strong>) whose only text-node child contains the specified string, "Vertreten durch".


XSLT-based verification:

Given this source XML document (the provided one, corrected to a well-formed XML document):

<p><strong>Vertreten durch:</strong><br />Max Mustermann</p>

This transformation:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/">
    <xsl:copy-of select=
    "//*[not(text()[2]) and contains(text()[1], 'Vertreten durch')]//ancestor-or-self::p[1]"/>
  </xsl:template>
</xsl:stylesheet>

evaluates the XPath expression and outputs the node that is selected by it:

<p>
   <strong>Vertreten durch:</strong>
   <br/>Max Mustermann</p>

Upvotes: 2

Michael Kay

Reputation: 163675

Don't try to process text nodes; use the string value of elements instead.

If the string value of an element E contains the substring Vertreten durch:, then the string value of all ancestors of E also contains this substring. So I think you simply need

//*[contains(., 'Vertreten durch:')]

If that doesn't answer the question, then the question needs to be clearer. An example would help.
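The string-value argument above can be checked with a small sketch. It uses Python's standard library rather than an XPath engine, so it only approximates //*[contains(., 'Vertreten durch:')]; the wrapper <body>, the second <p>, and the helper names string_value and elements_containing are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The sample from the question, with <br> closed so that it parses as XML.
# The <body> wrapper and the second <p> are added only for illustration.
DOC = (
    "<body>"
    "<p><strong>Vertreten durch:</strong><br />Max Mustermann</p>"
    "<p>Other</p>"
    "</body>"
)

def string_value(elem):
    """Concatenate all descendant text, like the XPath string value of an element."""
    return "".join(elem.itertext())

def elements_containing(root, needle):
    """Rough stand-in for //*[contains(., needle)]: matching tags in document order."""
    return [e.tag for e in root.iter() if needle in string_value(e)]

root = ET.fromstring(DOC)
print(elements_containing(root, "Vertreten durch:"))
# prints ['body', 'p', 'strong']
```

The <strong> element matches, and so does every one of its ancestors, because each ancestor's string value includes the text of <strong>. The unrelated second <p> and the empty <br/> do not match.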

Upvotes: 0

Luuk

Reputation: 14999

I created an example XML file and named it test.xml:

<root>
   <line1>
       <line11>A</line11>
       <line12>B</line12>
   </line1>
   <line2>
       <line21>C</line21>
       <line22>D</line22>
   </line2>
</root>

Using xmlstarlet, you can do:

D:\TEMP>xml sel -t -m "//*[contains(text(),'D')]/ancestor-or-self::*" -v "name()" -n test.xml
root
line2
line22

D:\TEMP>xml sel -t -m "//*[contains(text(),'A')]/ancestor-or-self::*" -v "name()" -n test.xml
root
line1
line11

D:\TEMP>xml sel -C -t -m "//*[contains(text(),'A')]/ancestor-or-self::*" -v "name()" -n test.xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:exslt="http://exslt.org/common" version="1.0" extension-element-prefixes="exslt">
  <xsl:output omit-xml-declaration="yes" indent="no"/>
  <xsl:template match="/">
    <xsl:for-each select="//*[contains(text(),'A')]/ancestor-or-self::*">
      <xsl:call-template name="value-of-template">
        <xsl:with-param name="select" select="name()"/>
      </xsl:call-template>
      <xsl:value-of select="'&#10;'"/>
    </xsl:for-each>
  </xsl:template>
  <xsl:template name="value-of-template">
    <xsl:param name="select"/>
    <xsl:value-of select="$select"/>
    <xsl:for-each select="exslt:node-set($select)[position()&gt;1]">
      <xsl:value-of select="'&#10;'"/>
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

D:\TEMP>
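For readers without xmlstarlet, here is a minimal sketch of what //*[contains(text(),'D')]/ancestor-or-self::* selects, written with Python's standard library only. ElementTree does not support contains() or the ancestor axes, so the sketch walks the tree by hand; the helper names first_text_node and ancestor_or_self_names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Same content as test.xml above.
TEST_XML = """<root>
   <line1>
       <line11>A</line11>
       <line12>B</line12>
   </line1>
   <line2>
       <line21>C</line21>
       <line22>D</line22>
   </line2>
</root>"""

def first_text_node(elem):
    """Return the first text-node child of elem, or '' if it has none.

    In XPath 1.0, contains(text(), s) only looks at the first text() child.
    """
    if elem.text:
        return elem.text
    for child in elem:
        if child.tail:
            return child.tail
    return ""

def ancestor_or_self_names(root, needle):
    """Emulate //*[contains(text(), needle)]/ancestor-or-self::* (names only)."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}
    selected = set()
    for elem in root.iter():
        if needle in first_text_node(elem):
            node = elem
            while node is not None:  # walk up: self, parent, ..., root
                selected.add(node)
                node = parent.get(node)
    # A node-set has no duplicates and is reported in document order.
    return [e.tag for e in root.iter() if e in selected]

root = ET.fromstring(TEST_XML)
print(ancestor_or_self_names(root, "D"))  # prints ['root', 'line2', 'line22']
print(ancestor_or_self_names(root, "A"))  # prints ['root', 'line1', 'line11']
```

The results match the xmlstarlet output above: the matching element itself plus every ancestor up to the document element.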

EDIT: With a minimal example from HTML (I made sure it is also well-formed XML):

D:\TEMP>type test.html
<html>
<head>
   <title>test</title>
</head>
<body>
   <p><strong>Vertreten durch:</strong><br />Max Mustermann</p>
</body>
</html>
D:\TEMP>xml sel -t -m //*[contains(text(),'Vertreten')] -c .. -n test.html
<p><strong>Vertreten durch:</strong><br/>Max Mustermann</p>

D:\TEMP>xml sel -C -t -m //*[contains(text(),'Vertreten')] -c .. -n test.html
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" indent="no"/>
  <xsl:template match="/">
    <xsl:for-each select="//*[contains(text(),'Vertreten')]">
      <xsl:copy-of select=".."/>
      <xsl:value-of select="'&#10;'"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

Upvotes: 0
