Smith

Reputation: 5959

Get text of all <p> in a div with HtmlAgilityPack

I have a div that contains paragraph tags like this:

<div class="div_5">
                <p>First Paragraph</p>
                <p>Second Paragraph</p>
                <p>Third Paragraph</p>
                <p>Fourth Paragraph</p>
 </div>
<div class="div_5">
                <p>First Paragraph</p>
                <p>Second Paragraph</p>
                <p>Third Paragraph</p>
                <p>Fourth Paragraph</p>
 </div>

I need to get the text of all the paragraphs using HtmlAgilityPack. I tried this:

Dim oPB As HAP.HtmlNodeCollection = doc.DocumentNode.SelectNodes("//div[@class='post-bodycopy clearfix']/child::text()/")
For Each item As HAP.HtmlNode In oPB
    Debug.Print(item.InnerText)
Next

The output I am expecting for each div is:

First Paragraph
Second Paragraph
Third Paragraph
Fourth Paragraph

But I am getting some HTML in the returned text. Can someone help me correct the problem?

Upvotes: 1

Views: 4062

Answers (1)

Jeff Mercado

Reputation: 134621

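You have to actually select the paragraphs' inner text. Your XPath selects something else entirely: it targets a div with a different class ('post-bodycopy clearfix') and asks for that div's own child text nodes rather than its <p> children.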

Dim query = doc.DocumentNode.SelectNodes("//div[@class='div_5']/p/text()")
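To print the text grouped per div, as the question asks, something along these lines might work. It is only a sketch: the sample markup, the Module/Main wrapper, and the HAP namespace alias are assumptions made so the snippet stands alone, and it uses Console.WriteLine in place of Debug.Print. It selects the <p> elements and reads InnerText, which also picks up text inside nested tags, whereas /p/text() returns only the direct text nodes.

```vb
Imports System
Imports HAP = HtmlAgilityPack   ' alias assumed to match the question's "HAP." prefix

Module Program
    Sub Main()
        ' Sample markup mirroring the question (assumed; load your own page instead).
        Dim html As String = "<div class=""div_5"">" &
            "<p>First Paragraph</p><p>Second Paragraph</p>" &
            "<p>Third Paragraph</p><p>Fourth Paragraph</p>" &
            "</div>" &
            "<div class=""div_5"">" &
            "<p>First Paragraph</p><p>Second Paragraph</p>" &
            "<p>Third Paragraph</p><p>Fourth Paragraph</p>" &
            "</div>"

        Dim doc As New HAP.HtmlDocument()
        doc.LoadHtml(html)

        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        Dim divs As HAP.HtmlNodeCollection = doc.DocumentNode.SelectNodes("//div[@class='div_5']")
        If divs Is Nothing Then Return

        For Each divNode As HAP.HtmlNode In divs
            ' "./p" is relative to the current div, so each div is handled separately.
            Dim paragraphs As HAP.HtmlNodeCollection = divNode.SelectNodes("./p")
            If paragraphs Is Nothing Then Continue For

            For Each p As HAP.HtmlNode In paragraphs
                ' DeEntitize turns entities such as &amp; back into plain characters.
                Console.WriteLine(HAP.HtmlEntity.DeEntitize(p.InnerText).Trim())
            Next
        Next
    End Sub
End Module
```

The Nothing checks matter because looping over a Nothing collection throws a NullReferenceException when the XPath matches no nodes.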

Upvotes: 3
