Reputation: 12525
I am very new to HTML Agility Pack. I am trying to find some documentation but am having some issues.
I have the following HTML:
<div class="person">
<a href="blah1.html">Person 1</a>
</div>
<div class="person">
<a href="blah2.html">Person 2</a>
</div>
<div class="person">
<a href="blah3.html">Person 3</a>
</div>
<div class="person">
<a href="blah4.html">Person 4</a>
</div>
Using the parser, how can I grab only the links inside a div that has the class person?
Thank you!
Upvotes: 2
Views: 2046
Reputation: 498904
The following XPath corresponds to your description:
//div[@class='person']/a/@href
It will return the href attributes of all a elements that reside directly under any div element whose class attribute is exactly equal to person.
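Because @class='person' compares the whole attribute value, a div written as class="person active" would not match. The sketch below contrasts the exact-match expression with a common XPath 1.0 idiom that treats class as a space-separated list. It assumes the HtmlAgilityPack NuGet package is referenced; the sample markup (including the extra "active" class) is made up for illustration and is not from the question.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample: the second div carries an extra class.
        const string sample =
            "<div class=\"person\"><a href=\"blah1.html\">Person 1</a></div>" +
            "<div class=\"person active\"><a href=\"blah2.html\">Person 2</a></div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(sample);

        // Exact match: only a div whose class attribute is exactly "person".
        const string exact = "//div[@class='person']/a";

        // Token match: "person" may be one of several space-separated classes.
        const string token =
            "//div[contains(concat(' ', normalize-space(@class), ' '), ' person ')]/a";

        foreach (var xpath in new[] { exact, token })
        {
            // SelectNodes returns null when nothing matches, hence the guard.
            var nodes = doc.DocumentNode.SelectNodes(xpath);
            var hrefs = nodes == null
                ? Enumerable.Empty<string>()
                : nodes.Select(a => a.GetAttributeValue("href", ""));

            Console.WriteLine(string.Join(", ", hrefs));
        }
    }
}
```

With this sample, the exact-match expression should pick up only the first link, while the token-match expression should pick up both. For the markup in the question, where every div has class="person" and nothing else, the two expressions select the same links.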
If you are more comfortable with jQuery style selectors, take a look at using CsQuery instead of the HTML Agility Pack.
Upvotes: 2
Reputation: 236188
With Html Agility Pack (available on NuGet):
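using System.Linq;
using HtmlAgilityPack;

HtmlDocument html = new HtmlDocument();
html.Load(path_to_html); // or html.LoadHtml(html_string)
// Note: SelectNodes returns null (not an empty collection) if no node matches
var links = html.DocumentNode.SelectNodes("//div[@class='person']/a")
    .Select(n => n.GetAttributeValue("href", null));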
Returns:
"blah1.html"
"blah2.html"
"blah3.html"
"blah4.html"
Upvotes: 3