Slayner

Reputation: 409

AgilityPack: getting the last <p> of the DOM tree

Given HTML like this:

<p>hello</p>
<p>
   <table>
      <tbody>
         <tr>
            <td>
               <p>is it me you're looking for</p>
            </td>
         </tr>
         <tr>
            <td>
               <p>can you have me too?</p>
            </td>
         </tr>
      </tbody>
    </table>
</p>

I'd like to get the InnerText of each P element, but I'm having trouble with the table part. When I loop through all the P elements, I get 4 InnerText values:

  1. hello
  2. is it me you're looking for can you have me too?
  3. is it me you're looking for
  4. can you have me too?

In this case I don't want the P that wraps the table, since I already get its text from the P elements inside the TDs. How can I select, with Agility Pack, only the P elements that have no other P among their descendants, so that the loop returns only 1, 3 and 4?

I currently get the P elements using:

HtmlDocument html = new HtmlDocument();
html.LoadHtml(markup); // markup: a string holding the HTML shown above
var pTag = html.DocumentNode.SelectNodes(".//p");

Upvotes: 0

Views: 38

Answers (1)

Keith Hall

Reputation: 16075

The XPath .//p[not(descendant::p)] will get 1, 3 and 4 from your example. It finds all p elements and then skips the ones that have a p descendant.
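As a rough illustration, here is a minimal sketch of how that XPath might be used with HtmlAgilityPack. It assumes the markup from the question is held in a string (the `markup` variable and the `Program` class are just names chosen for this example) and that the parser keeps the nested P elements the way the question describes.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical input: the markup from the question, with the first <p> closed.
        string markup = @"
<p>hello</p>
<p>
   <table>
      <tbody>
         <tr><td><p>is it me you're looking for</p></td></tr>
         <tr><td><p>can you have me too?</p></td></tr>
      </tbody>
   </table>
</p>";

        var html = new HtmlDocument();
        html.LoadHtml(markup);

        // Select only the p elements that have no p among their descendants.
        var leafParagraphs = html.DocumentNode.SelectNodes(".//p[not(descendant::p)]");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (leafParagraphs == null)
        {
            Console.WriteLine("No matching <p> elements found.");
            return;
        }

        foreach (HtmlNode p in leafParagraphs)
        {
            // Trim the whitespace that surrounds the text in the source markup.
            Console.WriteLine(p.InnerText.Trim());
        }
    }
}
```

The null check is there because SelectNodes returns null when no node matches, so looping over the result directly would throw on a document with no P elements.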

Upvotes: 1
