How to extract a link using xpath

Question

I'm trying to build an application that takes a web URL (e.g. http://www.explosm.net/comics/3104/) and stores, as a string, the link found at a given XPath (//*[@id="maincontent"]/div[2]/div[2]/div[1]/img). That node is a picture I want to download.

I honestly have no clue where to even begin with this. I've tried the HtmlAgilityPack and the WebBrowser class, but I couldn't find anything to help me understand what to do and how to do it.

Any help will be greatly appreciated.

Oscar Mederos · Accepted Answer

It is pretty easy with HtmlAgilityPack.

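using HtmlAgilityPack;

// Download and parse the page.
var w = new HtmlWeb();
var doc = w.Load("http://www.explosm.net/comics/3104/");

// Select the <img> node; SelectSingleNode returns null if the XPath matches nothing.
var imgNode = doc.DocumentNode.SelectSingleNode("//*[@id=\"maincontent\"]/div[2]/div[2]/div[1]/img");

// Read its src attribute, falling back to "" if the attribute is absent.
var src = imgNode.GetAttributeValue("src", "");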

The variable src will have the value http://www.explosm.net/db/files/Comics/Matt/Dont-be-a-dickhead.png.
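
SelectSingleNode returns null when nothing matches the XPath, and a src attribute can hold a relative path, so a more defensive version of the lookup might look like the sketch below. The class and method names (ImageLinkExtractor, ExtractImageUrl) are hypothetical and not part of HtmlAgilityPack; only SelectSingleNode, GetAttributeValue and the Uri constructor are real API calls.

```csharp
using System;
using HtmlAgilityPack;

static class ImageLinkExtractor
{
    // Hypothetical helper: returns the absolute URL of the image matched by
    // xpath, or null if the node or its src attribute is missing.
    public static string ExtractImageUrl(HtmlDocument doc, string xpath, Uri pageUri)
    {
        var imgNode = doc.DocumentNode.SelectSingleNode(xpath);
        if (imgNode == null)
            return null; // XPath matched nothing (e.g. the page layout changed)

        var src = imgNode.GetAttributeValue("src", "");
        if (string.IsNullOrEmpty(src))
            return null;

        // Resolve relative paths such as "/db/files/x.png" against the page URL;
        // an already absolute src is returned unchanged.
        return new Uri(pageUri, src).AbsoluteUri;
    }
}
```

It could be called as ImageLinkExtractor.ExtractImageUrl(doc, xpath, new Uri("http://www.explosm.net/comics/3104/")), with the caller checking for null before attempting the download.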

All you have to do then is download the image:

using System.Drawing;
using System.Net;

var request = (HttpWebRequest)WebRequest.Create(src);

// Dispose the response, the stream and the image once the file has been saved.
using (var response = request.GetResponse())
using (var stream = response.GetResponseStream())
using (var img = Image.FromStream(stream))
{
    // Here you have an Image object; save it or do whatever you want.
    img.Save(@"C:\file.png");
}
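
If the picture only needs to be saved and not manipulated, decoding it into an Image object is not strictly necessary. As an alternative to the approach above, here is a minimal sketch that writes the response bytes straight to disk with WebClient.DownloadFile. The names ImageDownloader and SaveImage are hypothetical, and reusing the file name from the URL is an assumption rather than something the original answer does. Note that WebClient is marked obsolete in recent .NET versions, although it still works.

```csharp
using System;
using System.IO;
using System.Net;

static class ImageDownloader
{
    // Hypothetical helper: saves the resource at src into targetDirectory,
    // keeping the file name from the URL, and returns the full path written.
    public static string SaveImage(string src, string targetDirectory)
    {
        // e.g. ".../Comics/Matt/some-comic.png" -> "some-comic.png"
        var fileName = Path.GetFileName(new Uri(src).LocalPath);
        var targetPath = Path.Combine(targetDirectory, fileName);

        using (var client = new WebClient())
        {
            // Writes the raw bytes to targetPath without decoding the image.
            client.DownloadFile(src, targetPath);
        }
        return targetPath;
    }
}
```

Because the bytes are copied as-is, the saved file keeps its original format, whereas Image.Save re-encodes the picture.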
