Getting text from an HTML document using HtmlAgilityPack via XPath

Question

I have the following HTML in a file, and I am loading this file into an HtmlDocument using HtmlAgilityPack.

The problem is that I only want to get "Hello world!" using XPath, not the element's whole inner text (which also includes the nested text).

How do I achieve this?


    
    <ul>
        <li>
            Hello world!
            <ul>
                <li>Welcome to planet!</li>
            </ul>
        </li>
    </ul>

dash · Accepted Answer

The XPath:

//ul/li[1]/text()

should select the actual text node containing "Hello world!".

You can then read the text of this node through its InnerText property.

In use:

string text = doc.DocumentNode.SelectSingleNode("//ul/li[1]/text()").InnerText;

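In essence, the expression says: find a ul node anywhere in the document, take its first li child, and then select that li's own text() nodes. Because text() only matches text that is a direct child of the li, the text inside any nested element is left out.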
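To make this concrete, here is a minimal console sketch. It assumes the HtmlAgilityPack NuGet package is referenced and that the markup is a nested list along the lines of the one in the question. The inline `html` string and the `Program` class are illustrative only and are not taken from the original post.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Assumed sample markup: an outer list item holding its own text
        // plus a nested list (hypothetical stand-in for the question's file).
        const string html = @"<ul>
    <li>
        Hello world!
        <ul>
            <li>Welcome to planet!</li>
        </ul>
    </li>
</ul>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html); // use doc.Load(path) to read from a file instead

        // Select the text node that is a direct child of the first li.
        HtmlNode textNode = doc.DocumentNode.SelectSingleNode("//ul/li[1]/text()");

        // SelectSingleNode returns null when nothing matches.
        if (textNode == null)
        {
            Console.WriteLine("No matching text node found.");
            return;
        }

        // The raw text node keeps surrounding whitespace, so trim it.
        Console.WriteLine(textNode.InnerText.Trim());
    }
}
```

With markup like the above, this should print only the outer item's own text. Two things to keep in mind: the text node includes the surrounding newlines and indentation, hence the `Trim()`, and `//ul/li[1]` also matches the first li of the nested ul, so `SelectSingleNode` is relied on here to return the first match in document order. If the nested items matter, a more specific path (or `SelectNodes`) is probably safer.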
