Weizen

Reputation: 263

Select a Node which has specified subnodes

I have to write a web scraper. The page I am scraping contains:

<a href="Something.php">
<div class="SPECIFIEDCLASS" title="other something">
</div>
</a>

What I wrote so far is:

var diiv = doc.DocumentNode.SelectNodes("//a/div[@class='SPECIFIEDCLASS']");

var hrefLiist = diiv.Select(q => q.GetAttributeValue("href", "not found")).ToList();

but it's not working.

Upvotes: 0

Views: 37

Answers (1)

Fᴀʀʜᴀɴ Aɴᴀᴍ

Reputation: 6251

Your XPath expression selects the div tags with the specified class that are children of a tags. What you want are the a tags that have such a div as a child, because the href attribute is on the a tag, not on the div. Use this XPath expression instead:

var diiv = doc.DocumentNode.SelectNodes("//a[div[@class='SPECIFIEDCLASS']]");

For a more visual explanation:

Your XPath does this:

  • Get each a tag.
  • Get its child div tags.
  • Keep the div tags with class = "SPECIFIEDCLASS". So ultimately the div tags themselves are selected, and they have no href attribute.

The correct XPath should do this:

  • Get each a tag.
  • Keep the a tags that have a child div tag with class = "SPECIFIEDCLASS". Here the a tags are selected, so href can be read from them.
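
Putting it together, here is a minimal sketch of the whole thing, assuming you are using HtmlAgilityPack (which the `doc.DocumentNode.SelectNodes` call in the question suggests). The second anchor, `Other.php` with `SOMEOTHERCLASS`, is not from your page; I made it up only to show that the predicate filters it out. The null check is there because, as far as I know, `SelectNodes` returns null rather than an empty collection when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Sample markup modeled on the question; the second <a> is a
        // hypothetical extra added to show what gets filtered out.
        const string html = @"
<a href=""Something.php"">
  <div class=""SPECIFIEDCLASS"" title=""other something""></div>
</a>
<a href=""Other.php"">
  <div class=""SOMEOTHERCLASS""></div>
</a>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Select the a elements that have a child div with the given class.
        // The predicate in [...] filters the a elements; it does not move
        // the selection down to the div.
        var anchors = doc.DocumentNode.SelectNodes("//a[div[@class='SPECIFIEDCLASS']]");

        // Guard against a null result before using LINQ on it.
        List<string> hrefList = anchors == null
            ? new List<string>()
            : anchors.Select(a => a.GetAttributeValue("href", "not found")).ToList();

        foreach (var href in hrefList)
        {
            Console.WriteLine(href);
        }
    }
}
```

Because the selected nodes are now the a tags, `GetAttributeValue("href", ...)` reads the attribute from the element that actually carries it, instead of falling back to the default.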

Upvotes: 1
