Reputation: 899
I am working in C# on .NET Core.
Which library or NuGet package can I use in C# to extract data from HTML content with an XPath expression?
I want something like:
extractedData = xpathLib.Extract(htmlContent, xpath)
I do not want to use a technique that launches a browser process (such as a Selenium driver opening Chrome), since I have to extract data from about 10,000 web pages per day.
Regards. PS: I have seen that Microsoft provides an XPath library, but it targets only XML.
Upvotes: 3
Views: 1716
Reputation: 267
You can use HTML Agility Pack.
This NuGet package works with XPath, XDocument and LINQ, and it is easy to use.
Here is an example from HTML Agility Pack:
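using HtmlAgilityPack;

// Download and parse the page, then select nodes with an XPath expression.
var url = "http://html-agility-pack.net/";
var web = new HtmlWeb();
var doc = web.Load(url);
var value = doc.DocumentNode.SelectNodes("//td/input"); // null when nothing matches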
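The example above downloads the page itself. If you already have the HTML as a string, as in the `Extract(htmlContent, xpath)` call from the question, something along these lines should work. This is only a minimal sketch: `XPathExtractor` and `Extract` are hypothetical names made up for illustration (they are not part of HTML Agility Pack), and returning the trimmed inner text of each node is an assumption about what "extracted data" means here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Hypothetical helper for illustration; not part of the library.
public static class XPathExtractor
{
    // Parses HTML held in memory and returns the trimmed inner text
    // of every node matched by the XPath expression.
    public static IReadOnlyList<string> Extract(string htmlContent, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(htmlContent); // parses in-process: no browser, no network call

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return Array.Empty<string>();
        }

        return nodes.Select(node => node.InnerText.Trim()).ToList();
    }
}
```

You would then call it as `var titles = XPathExtractor.Extract(htmlContent, "//h1");`. Since parsing happens entirely in memory, you are free to fetch the pages however you like (for example with `HttpClient`) and feed the resulting strings to the helper.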
Upvotes: 3