Reputation: 2402
When I run the code, ProductListPage is null; an error is thrown and execution does not continue.
Any ideas how to solve this? Should I wait until //div[@class='productContain padb6']//div[@class='large-4 medium-4 columns']/a is found, or do something else?
Here is my current code:
HtmlDocument htmlDoc = new HtmlWeb().Load("https://example.com/");
HtmlNodeCollection ProductListPage = htmlDoc.DocumentNode.SelectNodes("//div[@class='productContain padb6']//div[@class='large-4 medium-4 columns']/a");
foreach (HtmlNode src in ProductListPage)
{
    htmlDoc = new HtmlWeb().Load(src.Attributes["href"].Value);
    HtmlNodeCollection LinkTester = htmlDoc.DocumentNode.SelectNodes("//div[@class='row padt6 padb4']//a");
    if (LinkTester != null)
    {
        foreach (var dllink in LinkTester)
        {
            string LinkURL = dllink.Attributes["href"].Value;
            Console.WriteLine(LinkURL);
            string ExtractFilename = LinkURL.Substring(LinkURL.LastIndexOf("/"));
            var DLClient = new WebClient();
            DLClient.DownloadFileAsync(new Uri(LinkURL), @"C:\temp\" + ExtractFilename);
        }
    }
}
EDIT:
The code seems to work without a VPN connection but fails when the VPN is active. An alternative I wrote in Python with BeautifulSoup works regardless of the VPN connection. Any idea why C# and HtmlAgilityPack do not do the trick?
EDIT2:
I have noticed that over the VPN connection the page loads with a slight delay: the page itself loads first, and the content arrives afterwards.
Upvotes: 0
Views: 612
Reputation: 2402
After about two months of searching and reading, I finally found a solution. Adding this to app.config worked for me without any code changes:
<system.net>
  <defaultProxy useDefaultCredentials="true" />
</system.net>
My app.config now looks like this:
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <startup>
    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.7.2" />
  </startup>
  <system.net>
    <defaultProxy useDefaultCredentials="true" />
  </system.net>
</configuration>
Credit for this goes to the original answer: https://stackoverflow.com/a/40900485/7202022
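If editing app.config is not an option, roughly the same effect can probably be had from code by giving the default proxy the current user's credentials before the first request. This is only a sketch and an assumption on my part: I have not tested it against the VPN setup above, and it relies on HtmlWeb and WebClient going through the default System.Net proxy, as they do on .NET Framework.

```csharp
using System;
using System.Net;

class Program
{
    static void Main()
    {
        // WebRequest.DefaultWebProxy is the proxy used by default for
        // System.Net requests; it can be null if no proxy is configured.
        IWebProxy proxy = WebRequest.DefaultWebProxy;
        if (proxy != null)
        {
            // Assumed code-side counterpart of
            // <defaultProxy useDefaultCredentials="true" /> in app.config.
            proxy.Credentials = CredentialCache.DefaultCredentials;
        }

        Console.WriteLine(proxy != null
            ? "Default proxy will use the current user's credentials."
            : "No default proxy is configured.");

        // ... run the HtmlWeb / WebClient code from the question after this point.
    }
}
```

The config-file approach is still simpler, since it needs no recompilation.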
Upvotes: 1
Reputation: 11364
Make sure you have access to the site (perhaps a firewall or another application is blocking it).
When I ran your code, in both Visual Basic and .NET, I could reach the subpages and even find the PDF links. I would recommend using the debugger to inspect htmlDoc.DocumentNode.
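A rough sketch of that check without a debugger, assuming the HtmlAgilityPack NuGet package is installed: SelectNodes returns null rather than an empty collection when nothing matches, so the result is tested before the loop, and the start of the HTML that actually came back is printed. The 500-character cut-off and the variable names are my own choices.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        var web = new HtmlWeb();
        HtmlDocument htmlDoc = web.Load("https://example.com/");

        // SelectNodes returns null (not an empty collection) when no node matches.
        HtmlNodeCollection productLinks = htmlDoc.DocumentNode.SelectNodes(
            "//div[@class='productContain padb6']//div[@class='large-4 medium-4 columns']/a");

        if (productLinks == null)
        {
            // Show what the server actually returned, e.g. a proxy or login page.
            Console.WriteLine("No product links matched. HTTP status: " + web.StatusCode);
            string html = htmlDoc.DocumentNode.OuterHtml;
            Console.WriteLine(html.Substring(0, Math.Min(500, html.Length)));
            return;
        }

        foreach (HtmlNode link in productLinks)
        {
            // GetAttributeValue avoids a NullReferenceException if href is missing.
            Console.WriteLine(link.GetAttributeValue("href", string.Empty));
        }
    }
}
```

If the printed HTML is a proxy error or an empty shell rather than the product list, the problem is with how the request reaches the site, not with the XPath.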
Upvotes: 1