Select a Node which has specified subnodes

Question

I have to write a web scraper. My PHP page is:

What I wrote so far is:

var diiv = doc.DocumentNode.SelectNodes("//a/div[@class='SPECIFIEDCLASS']");

var hrefLiist = diiv.Select(q => q.GetAttributeValue("href", "not found")).ToList();

but it's not working.

Fᴀʀʜᴀɴ Aɴᴀᴍ · Accepted Answer

Your XPath expression selects the div tags with the specified class that sit directly inside a tags. A div tag has no href attribute, so GetAttributeValue falls back to "not found". What you want are the a tags that contain a div tag with the specified class. Use this XPath expression instead:

var diiv = doc.DocumentNode.SelectNodes("//a[div[@class='SPECIFIEDCLASS']]");

For a more visual explanation:

Your XPath does this to each a tag:

1. Get a tag.
2. Get child div tag.
3. Select div tags with class = "SPECIFIEDCLASS". So ultimately, the div tags themselves are selected.

The correct XPath should do this:

1. Get a tag.
2. Keep only the a tags whose child div tag has class = "SPECIFIEDCLASS". Here the a tags themselves are selected.
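
To see the difference between the two expressions side by side, here is a minimal sketch. It assumes doc is an HtmlAgilityPack HtmlDocument (the DocumentNode.SelectNodes call suggests this, but the question does not say so). The sample markup, the XPathDemo class, and the Hrefs helper are made up for illustration, since the real page is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package; assumed to be the parser behind `doc`

public static class XPathDemo
{
    // Hypothetical sample markup; the page from the question is not shown.
    public const string SampleHtml = @"
        <a href='/one'><div class='SPECIFIEDCLASS'>One</div></a>
        <a href='/two'><div class='other'>Two</div></a>
        <a href='/three'><span>Three</span></a>";

    // Runs an XPath query and reads href from every node it returns.
    public static List<string> Hrefs(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches,
        // so guard before calling LINQ methods on the result.
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes.Select(n => n.GetAttributeValue("href", "not found")).ToList();
    }

    public static void Main()
    {
        // Original query: the last step is div, so div nodes come back.
        // A div carries no href, so the default value is returned.
        var fromDivs = Hrefs(SampleHtml, "//a/div[@class='SPECIFIEDCLASS']");
        Console.WriteLine(string.Join(", ", fromDivs));

        // Corrected query: the predicate only filters, so a nodes come back.
        var fromAnchors = Hrefs(SampleHtml, "//a[div[@class='SPECIFIEDCLASS']]");
        Console.WriteLine(string.Join(", ", fromAnchors));
    }
}
```

On this sample the first query should print not found, because the matched node is the div, and the second should print /one, because the matched node is the a tag that wraps it. The null check is worth keeping in real code: if the class name is misspelled or the page layout changes, SelectNodes returns null and a direct call to Select would throw.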
