John

Reputation: 31

Html Agility Pack link and img src extraction

I have pages that use images as links, and I am trying to get each link's href as well as the src of the image inside it. My current code collects the hrefs fine, but it only gets the first img src on the page and repeats it for every link.

HtmlWeb hw = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = hw.Load(url);
HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
foreach (HtmlNode linkNode in linkNodes)
{
    HtmlAttribute link = linkNode.Attributes["href"];
    HtmlNode imageNode = linkNode.SelectSingleNode("//img");
    HtmlAttribute src = imageNode.Attributes["src"];

    string imageLink = link.Value;
    string imageUrl = src.Value;
}

Can someone tell me what's wrong, or suggest another way of doing it? Thanks.

Upvotes: 3

Views: 8173

Answers (1)

tdaines

Reputation: 581

Try changing

HtmlNode imageNode = linkNode.SelectSingleNode("//img");

to

HtmlNode imageNode = linkNode.SelectSingleNode(".//img");
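
The reason this works: in XPath, a path that starts with "//" is evaluated from the document root even when it is called on a child node, so "//img" always returns the first img on the page. The leading dot in ".//img" makes the search relative to the current link node. Below is a minimal sketch of the whole loop with that change, assuming the HTML is loaded from a string instead of a URL. The class name ImageLinkExtractor, the Extract method, and the sample markup are hypothetical and only there for illustration; the null checks are an addition, since SelectNodes and SelectSingleNode return null when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper class, not part of Html Agility Pack.
public static class ImageLinkExtractor
{
    // Returns (href, src) pairs for every <a href> that contains an <img src>.
    public static List<(string Href, string Src)> Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var pairs = new List<(string Href, string Src)>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (linkNodes == null)
        {
            return pairs;
        }

        foreach (HtmlNode linkNode in linkNodes)
        {
            // ".//img[@src]" searches only the descendants of this link;
            // "//img" would search the whole document every time.
            HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@src]");
            if (imageNode == null)
            {
                continue; // a text-only link, nothing to pair
            }

            string imageLink = linkNode.GetAttributeValue("href", "");
            string imageUrl = imageNode.GetAttributeValue("src", "");
            pairs.Add((imageLink, imageUrl));
        }

        return pairs;
    }

    public static void Main()
    {
        // Made-up sample markup: two image links and one text-only link.
        string html =
            "<a href='/one'><img src='1.png'></a>" +
            "<a href='/two'><img src='2.png'></a>" +
            "<a href='/text'>no image</a>";

        foreach (var (href, src) in Extract(html))
        {
            Console.WriteLine(href + " -> " + src);
        }
    }
}
```

If only image links are of interest, selecting with "//a[@href][.//img[@src]]" in the first query should also work and would skip the text-only links up front.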

Hope this helps.

Upvotes: 2
