Reputation: 6305
I have the following HTML that I want to query with XPath:
<div class="buying">
<h1 class="parseasinTitle ">
<span id="btAsinTitle">Top Ten Tips for Growing Your Own Tomatoes (The Basic Art of Italian Cooking) <span style="text-transform: capitalize; font-size: 16px;">[Kindle Edition]</span></span>
</h1>
</div>
I just want to extract
Top Ten Tips for Growing Your Own Tomatoes (The Basic Art of Italian Cooking)
so I am using textContent with the following XPath query:
$xpath_books->query('//span[@id="btAsinTitle"]')
but the result is
Top Ten Tips for Growing Your Own Tomatoes (The Basic Art of Italian Cooking) [Kindle Edition]
I think I have to exclude <span style="text-transform: capitalize; font-size: 16px;"> to get what I want. How can I do it?
Upvotes: 2
Views: 578
Reputation: 317177
Your XPath does return only the node with that id, but because the DOM is a tree of linked DOMNodes, the returned node also contains its child nodes. When you access the returned span with nodeValue or textContent, PHP returns the combined text of all DOMText nodes below it, including the one in the child span holding "[Kindle Edition]".
      SPAN
     /    \
  TEXT    SPAN
             \
             TEXT
More on that at DOMDocument in php
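To make the tree above concrete, here is a minimal sketch that walks the matched span and lists the node names it finds. The shortened markup and the `describe()` helper are assumptions made for illustration; they are not part of the original question or of PHP's DOM API.

```php
<?php
// Hypothetical, shortened version of the markup from the question.
$html = '<span id="btAsinTitle">Top Ten Tips <span>[Kindle Edition]</span></span>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$span = $xpath->query('//span[@id="btAsinTitle"]')->item(0);

// Illustrative helper: returns one indented line per node in the subtree.
function describe(DOMNode $node, int $depth = 0): array
{
    $lines = [str_repeat('  ', $depth) . $node->nodeName];
    if ($node->hasChildNodes()) {
        foreach ($node->childNodes as $child) {
            $lines = array_merge($lines, describe($child, $depth + 1));
        }
    }
    return $lines;
}

$tree = describe($span);
echo implode(PHP_EOL, $tree), PHP_EOL;
```

Text nodes show up as `#text`, so the listing should mirror the diagram: one text node directly under the outer span, and a second one under the nested span.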
If you want to fetch only the first text part, you have to fetch the nodeValue of the first childNode:
echo $result->item(0)->childNodes->item(0)->nodeValue;
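A self-contained sketch of that approach, assuming the markup from the question is loaded with `DOMDocument::loadHTML()` (the variable names are placeholders). Note that the first text node ends with a space before the nested span, so the example applies `trim()`:

```php
<?php
// Markup taken from the question, joined into one string.
$html = '<div class="buying"><h1 class="parseasinTitle ">'
      . '<span id="btAsinTitle">Top Ten Tips for Growing Your Own Tomatoes (The Basic Art of Italian Cooking) '
      . '<span style="text-transform: capitalize; font-size: 16px;">[Kindle Edition]</span></span>'
      . '</h1></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$result = $xpath->query('//span[@id="btAsinTitle"]');
$span = $result->item(0);

// textContent concatenates the text of every descendant node.
$full = $span->textContent;

// The first child is the DOMText node in front of the nested span.
$title = trim($span->childNodes->item(0)->nodeValue);

echo $full, PHP_EOL;
echo $title, PHP_EOL;
```

This relies on the title being the first child of the span. If the markup ever starts with another element or a comment, `item(0)` would no longer be the text you want.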
An alternative to fetch that string with XPath directly would be
echo $xpath->evaluate('string(//span[@id="btAsinTitle"]/text())');
See http://php.net/manual/en/domxpath.evaluate.php
If you want to return the whole DOMText node instead, use
//span[@id="btAsinTitle"]/text()
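The difference between the two XPath variants can be sketched as follows, again assuming the question's markup and placeholder variable names. The `normalize-space()` call is an addition that is not in the answer above; it is a standard XPath 1.0 function that also strips the trailing space.

```php
<?php
// Markup taken from the question, joined into one string.
$html = '<div class="buying"><h1 class="parseasinTitle ">'
      . '<span id="btAsinTitle">Top Ten Tips for Growing Your Own Tomatoes (The Basic Art of Italian Cooking) '
      . '<span style="text-transform: capitalize; font-size: 16px;">[Kindle Edition]</span></span>'
      . '</h1></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// evaluate() with string() returns a plain PHP string.
$asString = $xpath->evaluate('string(//span[@id="btAsinTitle"]/text())');

// normalize-space() additionally trims leading and trailing whitespace.
$trimmed = $xpath->evaluate('normalize-space(//span[@id="btAsinTitle"]/text())');

// query() returns a DOMNodeList holding the DOMText node itself.
$nodes = $xpath->query('//span[@id="btAsinTitle"]/text()');
$textNode = $nodes->item(0);

var_dump(is_string($asString));
var_dump($textNode instanceof DOMText);
echo $trimmed, PHP_EOL;
```

Use the string form when you only need the value, and the node form when you want to modify or remove the text node in the document.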
Upvotes: 4