Reputation: 169
I am using HtmlAgilityPack to get all the href links from a web page, but the loaded page doesn't contain all of them.
In the browser, I saw that the page doesn't show all the links until you scroll all the way down. I then resized (zoomed) the browser window so that the whole page was visible without scrolling, and at that moment all the links appeared. Maybe JavaScript needs to be triggered?
using System.Diagnostics;
using HtmlAgilityPack;

HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument Doc = web.Load("https://www.verkkokauppa.com/fi/catalog/438b/Televisiot/products?page=1");
// Select the anchor inside each product grid item and print its href.
foreach (HtmlNode item in Doc.DocumentNode.SelectNodes("//li[@class='product-list-grid__grid-item']/a"))
{
    Debug.WriteLine(item.GetAttributeValue("href", string.Empty));
}
Each page has 24 product links, but I get only 15 of them.
Upvotes: 0
Views: 1012
Reputation: 5808
Check the Network tab in Chrome on that page. There are AJAX requests to https://www.verkkokauppa.com/resp-api/product?pids=467610, so the products are loaded using JavaScript.
You can't just trigger JavaScript here. HtmlAgilityPack is an HTML parser: it reads the markup the server returns and does not execute scripts. If you want to work with dynamic content, you need a browser engine. I think you should look at Selenium and PhantomJS.
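As a rough illustration of the browser-engine route, here is a minimal sketch using Selenium WebDriver with Chrome. It assumes the Selenium.WebDriver NuGet package and a local Chrome installation, and it reuses the XPath from the question. The scroll loop, the one-second wait, and the limit of 10 attempts are my own assumptions, not anything the site documents, so they may need tuning.

```csharp
using System;
using System.Threading;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class Program
{
    static void Main()
    {
        // Assumption: Selenium.WebDriver is installed and Chrome is available locally.
        using (IWebDriver driver = new ChromeDriver())
        {
            driver.Navigate().GoToUrl(
                "https://www.verkkokauppa.com/fi/catalog/438b/Televisiot/products?page=1");

            var js = (IJavaScriptExecutor)driver;
            // Same XPath as in the question.
            By links = By.XPath("//li[@class='product-list-grid__grid-item']/a");

            // Scroll to the bottom repeatedly until the number of links stops growing.
            // The attempt limit and the delay are arbitrary values chosen for this sketch.
            int previous = -1;
            for (int attempt = 0; attempt < 10; attempt++)
            {
                js.ExecuteScript("window.scrollTo(0, document.body.scrollHeight);");
                Thread.Sleep(1000); // give the page's scripts time to load more products
                int current = driver.FindElements(links).Count;
                if (current == previous)
                {
                    break;
                }
                previous = current;
            }

            // Print the href of every product link that is now in the DOM.
            foreach (IWebElement a in driver.FindElements(links))
            {
                Console.WriteLine(a.GetAttribute("href"));
            }
        }
    }
}
```

If you would rather keep your existing parsing code, you could instead pass `driver.PageSource` to HtmlAgilityPack's `HtmlDocument.LoadHtml` once the scrolling has finished. Jumping straight to the bottom may skip items that only load when they enter the viewport, in which case scrolling in smaller steps might work better.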
Upvotes: 1