Reputation: 405
I'm trying to get the information from all divs with class="top-tournament " using HtmlAgilityPack in C#.
The problem is that the nodes variable is always empty, which means I'm not doing it the proper way.
HTML example
With this code:

    using System.Net.Http;
    using System.Threading.Tasks;
    using HtmlAgilityPack;

    class Program
    {
        static void Main(string[] args)
        {
            startCrawlerAsync().Wait();
        }

        private static async Task startCrawlerAsync()
        {
            var url = "https://live.soccerstreams.net/home";
            var httpClient = new HttpClient();
            // Download the raw HTML exactly as the server sends it
            var html = await httpClient.GetStringAsync(url);
            var htmlDocument = new HtmlDocument();
            htmlDocument.LoadHtml(html);
            // This is the variable that always ends up empty
            HtmlNodeCollection nodes = htmlDocument.DocumentNode.SelectNodes("//div[@class=\"top-tournament \"]");
        }
    }
Upvotes: 1
Views: 414
Reputation: 805
If you look at htmlDocument.ParsedText, you will see that the website returns JavaScript as part of its body. That JavaScript executes in your browser and builds the HTML you see. HtmlAgilityPack can't execute JavaScript, so the divs never exist in the document it parses, and you get null for nodes.
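A minimal sketch of why the result is null and how to guard against it. The class name NodeCheckSketch, the helper CountTopTournamentDivs, and both HTML strings are made up for illustration; they are not the real page. It also swaps the exact-match XPath for contains(), which tolerates the trailing space in the class attribute (note that it would also match longer class names that merely contain "top-tournament").

```csharp
using System;
using HtmlAgilityPack;

public static class NodeCheckSketch
{
    // Hypothetical helper: counts divs whose class attribute contains "top-tournament".
    public static int CountTopTournamentDivs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // contains() tolerates the trailing space and any additional classes.
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//div[contains(@class, 'top-tournament')]");

        // SelectNodes returns null, not an empty collection, when nothing matches.
        return nodes?.Count ?? 0;
    }

    public static void Main()
    {
        // Stand-in for what HttpClient downloads: only a script, no divs yet.
        string scriptOnly = "<html><body><script>var app = {};</script></body></html>";

        // Stand-in for what the browser shows after the script has run.
        string rendered = "<html><body><div class=\"top-tournament \">Cup</div></body></html>";

        Console.WriteLine(CountTopTournamentDivs(scriptOnly)); // prints 0
        Console.WriteLine(CountTopTournamentDivs(rendered));   // prints 1
    }
}
```

Checking for null before iterating avoids a NullReferenceException, but it does not make the missing divs appear; the page still has to be rendered by something that runs JavaScript.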
If you want to use C# for the above task I would recommend looking at the following question: Scraping webpage generated by javascript with C#
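One approach commonly suggested for this kind of page is to let a real browser render it and then hand the resulting HTML to HtmlAgilityPack. The sketch below assumes Selenium WebDriver (the Selenium.WebDriver and Selenium.Support NuGet packages) and a local Chrome installation; none of that comes from the question. The class name, the 10-second timeout, and the contains() XPath are also my assumptions, and I have not verified it against the live site.

```csharp
using System;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

public static class RenderedPageSketch
{
    public static void Main()
    {
        var options = new ChromeOptions();
        options.AddArgument("--headless"); // run Chrome without a visible window

        using (IWebDriver driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl("https://live.soccerstreams.net/home");

            // Wait (arbitrary 10 s limit) until the page's JavaScript has created
            // at least one matching div; throws WebDriverTimeoutException otherwise.
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
            wait.Until(d => d.FindElements(By.CssSelector("div.top-tournament")).Count > 0);

            // PageSource holds the DOM after scripts ran, unlike HttpClient's raw response.
            var htmlDocument = new HtmlDocument();
            htmlDocument.LoadHtml(driver.PageSource);

            HtmlNodeCollection nodes = htmlDocument.DocumentNode
                .SelectNodes("//div[contains(@class, 'top-tournament')]");

            if (nodes == null)
            {
                Console.WriteLine("No matching divs found.");
                return;
            }

            foreach (HtmlNode node in nodes)
            {
                Console.WriteLine(node.InnerText.Trim());
            }
        }
    }
}
```

The rest of the parsing code stays the same as in the question; only the source of the HTML string changes from HttpClient to the browser.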
Upvotes: 2