Reputation: 11171

PHP DomXPath not selecting empty text nodes

I'm trying to select nodes which don't contain any text. This bit of PHP code skips the empty node in the sample XML. However, when I try an online tester (like http://freeformatter.com/xpath-tester.html), it selects the node without any problem.

Is this a PHP thing?

My PHP code:

    $path = "//RecipeSteps/RecipeStep[not(text())]";
    $stepsQuery = $this->xpath->query($path);
    $numResults = $stepsQuery->length;

My sample XML:

<?xml version="1.0" encoding="utf-8"?>
<Recipes>
    <RecipeSteps>
      <RecipeStep number="1">Dummy content</RecipeStep>
      <RecipeStep number="2">Dummy content</RecipeStep>
      <RecipeStep number="3">Dummy content</RecipeStep>
      <RecipeStep number="4">Dummy content</RecipeStep>
      <RecipeStep number="5">Dummy content</RecipeStep>
      <RecipeStep number="6"></RecipeStep>
      <RecipeStep number="7">Variations</RecipeStep>
      <RecipeStep number="8">Some variation content..</RecipeStep>
    </RecipeSteps>
</Recipes>

Upvotes: 0

Answers (3)

Mathias Müller

Reputation: 22617

The path expressions //RecipeStep[not(text())] and //RecipeStep[string-length() = 0] do not mean the same thing, but for the document you have shown they return exactly the same result. In both cases, one RecipeStep node is selected:

<RecipeStep number="6"/>

//RecipeStep[not(text())] means, in plain English:

Select element nodes called RecipeStep anywhere in the document, but only if they do not have any immediate child text nodes.

On the other hand, //RecipeStep[string-length() = 0] means:

Select element nodes called RecipeStep anywhere in the document, but only if the length of their string value (the concatenation of all descendant text nodes) is equal to 0.

The difference would only be apparent if recipe step number 6 actually looked like

<RecipeStep number="6"><child>text</child></RecipeStep>

Then, //RecipeStep[not(text())] would still select this node, whereas //RecipeStep[string-length() = 0] would not return anything.
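To check the difference yourself, here is a minimal sketch. It assumes a shortened, hypothetical variant of the sample document in which step 6 contains a child element, and it uses plain DOMDocument and DOMXPath objects rather than the `$this->xpath` property from the question.

```php
<?php
// Hypothetical variant of the sample document: step 6 has a child element
// that contains text, but no text node of its own.
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<Recipes>
    <RecipeSteps>
      <RecipeStep number="5">Dummy content</RecipeStep>
      <RecipeStep number="6"><child>text</child></RecipeStep>
      <RecipeStep number="7">Variations</RecipeStep>
    </RecipeSteps>
</Recipes>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Looks only at immediate child text nodes of each RecipeStep.
$noTextChild = $xpath->query('//RecipeStep[not(text())]');
// Looks at the string value, i.e. all descendant text nodes concatenated.
$emptyString = $xpath->query('//RecipeStep[string-length() = 0]');

echo 'not(text()): ', $noTextChild->length, "\n";          // 1 (step 6)
echo 'string-length() = 0: ', $emptyString->length, "\n";  // 0
```

With the original sample, where step 6 is completely empty, both queries should return one node.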

(And just to make it clear: the leading //RecipeSteps that I have omitted does not change anything.)

So your original XPath expression is correct, and on this input the accepted answer does exactly the same thing as your original one. XPath is not at fault here.

Upvotes: 0

felipsmartins

Reputation: 13549

It works when selecting the full path:

$xmlString = '<?xml version="1.0" encoding="utf-8"?>
<Recipes>
    <RecipeSteps>
      <RecipeStep number="1">Dummy content</RecipeStep>
      <RecipeStep number="2">Dummy content</RecipeStep>
      <RecipeStep number="3">Dummy content</RecipeStep>
      <RecipeStep number="4">Dummy content</RecipeStep>
      <RecipeStep number="5">Dummy content</RecipeStep>
      <RecipeStep number="6"></RecipeStep>
      <RecipeStep number="7">Variations</RecipeStep>
      <RecipeStep number="8">Some variation content..</RecipeStep>
    </RecipeSteps>
</Recipes>';

$dom = new DOMDocument();
$dom->loadXML($xmlString);
$xpath = new DOMXPath($dom);
// The shorter //RecipeSteps/RecipeStep[not(text())] works just as well.
$query = $xpath->query('//Recipes/RecipeSteps/RecipeStep[not(text())]');
// Prints "RecipeStep number: 6"
print 'RecipeStep number: ' . $query->item(0)->getAttribute('number');

Selecting "//RecipeSteps/RecipeStep[not(text())]" works just as well, so the problem most likely lies elsewhere in your code.
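One way to narrow it down is to confirm that the XML actually loaded and that the query was accepted before looking at the result length. The following is only a sketch: `findEmptySteps` is a hypothetical helper name, and the inline sample is a shortened stand-in for your real document.

```php
<?php
// Hypothetical helper (not from the question): returns the "number"
// attributes of RecipeStep elements that have no text child, and fails
// loudly if the XML does not load or the query is rejected.
function findEmptySteps(string $xmlString): array
{
    // Collect libxml parse errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $loaded = $dom->loadXML($xmlString);
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        $first = $errors ? trim($errors[0]->message) : 'unknown error';
        throw new RuntimeException('XML did not load: ' . $first);
    }

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//RecipeSteps/RecipeStep[not(text())]');
    if ($nodes === false) {
        // DOMXPath::query() returns false for a malformed expression.
        throw new RuntimeException('XPath expression was rejected');
    }

    $numbers = [];
    foreach ($nodes as $node) {
        $numbers[] = $node->getAttribute('number');
    }
    return $numbers;
}

$sample = '<Recipes><RecipeSteps>'
    . '<RecipeStep number="5">Dummy content</RecipeStep>'
    . '<RecipeStep number="6"></RecipeStep>'
    . '</RecipeSteps></Recipes>';

print implode(', ', findEmptySteps($sample)) . "\n"; // prints "6"
```

If this prints the expected step for your real file, the XPath and the document are fine and the issue is more likely in how `$this->xpath` is set up.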

Upvotes: 0

MadsBjaerge

Reputation: 126

If you are looking for an XPath solution, use //RecipeSteps/RecipeStep[string-length() = 0]. (PHP's DOMXPath implements XPath 1.0, so the last step must not be wrapped in parentheses.) For example:

$path = "//RecipeSteps/RecipeStep[string-length() = 0]";
$stepsQuery = $this->xpath->query($path);
$numResults = $stepsQuery->length;

Upvotes: 1
