Reputation: 267
I have this code:
foreach (HtmlNode node in hd.DocumentNode.SelectNodes("//div[@class='compTitle options-toggle']//a"))
{
string s=("node:" + node.GetAttributeValue("href", string.Empty));
}
I want to get urls in tags like this:
<div class="compTitle options-toggle">
<a class=" ac-algo fz-l ac-21th lh-24" href="http://www.bestbuy.com">
<b>Huawei</b> Products - Best Buy
</a>
</div>
I want to get "http://www.bestbuy.com" and "Huawei Products - Best Buy"
what should I do? Is my code correct?
Upvotes: 0
Views: 797
Reputation: 320
The closing double quote should fix the selecting (it worked for me).
Get the plain text as
string contentText = node.InnerText;
or having the Huawei word in bold, like this:
string contentHtml = node.InnerHtml;
Upvotes: 1
Reputation: 929
this is an example of working code
var document = new HtmlDocument();
document.LoadHtml("<div class=\"compTitle options-toggle\"><a class=\" ac-algo fz-l ac-21th lh-24\" href=\"http://www.bestbuy.com\"><b>Huawei</b> Products - Best Buy</a></div>");
var tags = document.DocumentNode.SelectNodes("//div[@class='compTitle options-toggle']//a").ToList();
foreach (var tag in tags)
{
var link = tag.Attributes["href"].Value; // http://www.bestbuy.com
var text = tag.InnerText; // Huawei Products - Best Buy
}
Upvotes: 1