good-to-know

Reputation: 742

How to get only the inner text, without the child tags, using HtmlAgilityPack?

I have an HTML page like the one below. I need to extract only the 'blah blah blah' text from the 'span' tag.

<span class="news">
blah blah blah
<div>hello</div>
<div>bye</div> 
</span>

This gives me the text of the span and of all its child tags:

div.SelectSingleNode(".//span[@class='news']").InnerText.Trim();

This gives me null:

div.SelectSingleNode(".//span[@class='news']/preceding-sibling::text()").InnerText.Trim();

How do I get the text before the 'div' tag using HtmlAgilityPack?

Upvotes: 9

Views: 4148

Answers (1)

har07

Reputation: 89335

Your second attempt was pretty close. The text node is a child of span[@class='news'], not a sibling of it (neither preceding nor following), so use /text() instead of /preceding-sibling::text():

div.SelectSingleNode(".//span[@class='news']/text()")
   .InnerText
   .Trim();
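
For a version that can be run on its own, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is referenced and that the markup is loaded from a string. The names html, textNodes and ownText are made up for illustration, and the [normalize-space()] predicate is an addition not in the snippet above; it skips whitespace-only text nodes, which helps if the wanted text is not the first child of the span.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

// Sample markup copied from the question.
const string html = @"<span class=""news"">
blah blah blah
<div>hello</div>
<div>bye</div>
</span>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

// text() matches only text nodes that are direct children of the span;
// [normalize-space()] filters out the whitespace-only ones between the divs.
var textNodes = doc.DocumentNode
    .SelectNodes("//span[@class='news']/text()[normalize-space()]");

// SelectNodes returns null (not an empty collection) when nothing matches,
// so guard against that before reading InnerText.
string ownText = textNodes == null
    ? string.Empty
    : string.Join(" ", textNodes.Select(n => n.InnerText.Trim()));

Console.WriteLine(ownText);
```

The null check matters with SelectSingleNode as well: it returns null when the XPath matches nothing, and calling .InnerText on that result throws a NullReferenceException.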

Upvotes: 13
