Chris McAtackney

Reputation: 5242

XPath Query Problem using HTML Agility Pack

I'm trying to scrape the price field from this website using the HTML Agility Pack.

My code is as follows:

    var web = new HtmlWeb();
    var doc = web.Load(String.Format(overClockersURL, componentID));
    var priceContent = doc.DocumentNode.SelectSingleNode("//*[@id=\"prodprice\"]");

I obtained the XPath query by using Firebug's "Copy as XPath" feature.

The problem I'm having is that SelectSingleNode is returning null - it doesn't seem to find the element specified by the query. I'm a bit stumped as to why, but I don't have much experience with XPath, so I would appreciate some pointers as to what I've done wrong.

Upvotes: 2

Views: 2085

Answers (2)

Oscar Mederos

Reputation: 29863

When that happens, you should check whether the page is being loaded correctly (you said you're behind an HTTP proxy?).

Try writing the content of doc.DocumentNode.OuterHtml to a text file so you can see if the page is being loaded correctly. Maybe you're getting an error page instead of the original page.
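A minimal sketch of that check, assuming the HtmlAgilityPack package is referenced. The URL is borrowed from the other answer in this thread, and the output file name `loaded-page.html` and the `Dump` helper are just placeholders I made up for illustration:

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

class DumpLoadedPage
{
    // Hypothetical helper: writes whatever markup the parser actually received to disk.
    public static void Dump(HtmlDocument doc, string path)
    {
        File.WriteAllText(path, doc.DocumentNode.OuterHtml);
    }

    static void Main()
    {
        // Assumed URL; substitute the one your code builds with String.Format.
        var url = "http://www.overclockers.co.uk/showproduct.php?prodid=GX-033-HS";

        var web = new HtmlWeb();
        var doc = web.Load(url);

        // StatusCode holds the HTTP status of the last response HtmlWeb received.
        Console.WriteLine("HTTP status: " + web.StatusCode);

        // Placeholder file name; open it in a browser or editor afterwards.
        Dump(doc, "loaded-page.html");
    }
}
```

If the saved file turns out to be a proxy login or error page, or simply has no element with id "prodprice", the XPath query is probably fine and the problem is in what was downloaded.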

Upvotes: 3

Simon Mourier

Reputation: 139256

If I run this code:

    var web = new HtmlWeb();
    var doc = web.Load("http://www.overclockers.co.uk/showproduct.php?prodid=GX-033-HS");
    var priceContent = doc.DocumentNode.SelectSingleNode("//*[@id=\"prodprice\"]");
    Console.WriteLine("price=" + priceContent.InnerHtml);

It outputs:

price=529.99

So it seems to be working. You can also use "//span[@id=\"prodprice\"]", which is more specific because it only matches SPAN elements.
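Since SelectSingleNode returns null when nothing matches, it may be worth guarding the lookup rather than dereferencing the result directly. A rough sketch, assuming HtmlAgilityPack is referenced; the inline markup and the `FindPrice` helper are invented for illustration and are not the real page:

```csharp
using System;
using HtmlAgilityPack;

class PriceLookup
{
    // Hypothetical helper: returns the trimmed text of <span id="prodprice">,
    // or null when no such element exists in the document.
    public static string FindPrice(HtmlDocument doc)
    {
        var node = doc.DocumentNode.SelectSingleNode("//span[@id='prodprice']");
        return node == null ? null : node.InnerText.Trim();
    }

    static void Main()
    {
        var doc = new HtmlDocument();
        // Made-up markup standing in for the downloaded product page.
        doc.LoadHtml("<html><body><div id='other'>x</div>" +
                     "<span id='prodprice'> 529.99 </span></body></html>");

        var price = FindPrice(doc);
        Console.WriteLine("price=" + (price ?? "(not found)"));
    }
}
```

Single quotes inside the XPath string avoid the backslash escaping, and the null check turns a NullReferenceException into a readable "(not found)" message when the page that came back is not the one you expected.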

Upvotes: 1
