Reputation: 405
I'm looking to get the output:
50ml milk
From the following code:
<ul class="ingredients-list__group">
<li>50ml <a href="/glossary/milk" class="tooltip-processed">milk
<div class="tooltip">
<h2 class="node-title">Milk</h2> <span class="fonetic">mill-k</span>
<p>One of the most widely used ingredients, milk is often referred to as a complete food. While cow…</p>
</div>
</a>
</li>
</ul>
Currently I'm using the XPath:
//ul[@class="ingredients-list__group"]/li
But getting:
50ml milk Milk mill-kOne of the most widely used ingredients, milk is often referred to as a complete food. While cow…
How do I exclude the stuff within the div/tooltip?
Upvotes: 2
Views: 2353
Reputation: 92854
With XPath 2.0:
//ul[@class="ingredients-list__group"]/li/concat(./text()[1], ./a/text()[1])
With XPath 1.0:
concat(//ul[@class="ingredients-list__group"]/li/text()[1], //ul[@class="ingredients-list__group"]/li/a/text()[1])
Upvotes: 2
Reputation: 163262
You can select the relevant text nodes using
//ul[@class="ingredients-list__group"]//text()[not(ancestor::div[@class='tooltip'])]
If you're in XPath 2.0 you can then put this in a call of string-join() to join these into a single string. If you're stuck with 1.0, you'll have to return multiple text nodes to the calling application and concatenate them together in the host language code.
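For the 1.0 case, the host-language step might look roughly like the Python sketch below. It is only an illustration, not a definitive implementation: it uses the standard library's xml.etree.ElementTree, whose limited XPath subset does not support text() node tests or the ancestor axis, so a small recursive walk stands in for the predicate. The helper name text_outside_tooltip is made up for this example, and it assumes the markup parses as well-formed XML, as the sample from the question does.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question (it happens to be well-formed XML).
HTML = """<ul class="ingredients-list__group">
<li>50ml <a href="/glossary/milk" class="tooltip-processed">milk
<div class="tooltip">
<h2 class="node-title">Milk</h2> <span class="fonetic">mill-k</span>
<p>One of the most widely used ingredients, milk is often referred to as a complete food. While cow…</p>
</div>
</a>
</li>
</ul>"""


def text_outside_tooltip(element):
    """Yield the text nodes under element, skipping any div.tooltip subtree.

    Hypothetical helper: it emulates text()[not(ancestor::div[@class='tooltip'])].
    """
    if element.text:
        yield element.text
    for child in element:
        is_tooltip = child.tag == "div" and child.get("class") == "tooltip"
        if not is_tooltip:
            yield from text_outside_tooltip(child)
        # Tail text sits after the child's closing tag, so it belongs to the
        # parent and is kept even when the child itself was skipped.
        if child.tail:
            yield child.tail


root = ET.fromstring(HTML)

results = []
for li in root.iter("li"):
    joined = "".join(text_outside_tooltip(li))
    # Collapse the newlines and repeated spaces left over from the markup.
    results.append(" ".join(joined.split()))

print(results)
```

With an engine that does support text(), such as lxml, the strings returned for the expression above could be joined and whitespace-normalized the same way.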
Upvotes: 0