Thierry Brémard

Reputation: 899

Locate XPath content of HTML in C#

I am working in C# on .NET Core.

Which library or NuGet package can I use in C# to extract data from HTML with an XPath expression?

I want something like:

extractedData = xpathLib.Extract(htmlContent, xpath)

I do not want to use a technique that launches a browser process (such as a Selenium driver opening Chrome), since I have to extract data from about 10,000 web pages per day.

Regards. PS: I have seen that Microsoft provides an XPath library, but it targets only XML.

Upvotes: 3

Views: 1716

Answers (1)

Göksel ÖZER

Reputation: 267

You can use HTML Agility Pack.

This NuGet package supports XPath, XDocument, and LINQ, and it is easy to use.

Here is an example from the HTML Agility Pack site:

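using HtmlAgilityPack;

// Download and parse the page in-process (no browser is started).
var url = "http://html-agility-pack.net/";
var web = new HtmlWeb();
var doc = web.Load(url);
// SelectNodes takes an XPath expression; it returns null when nothing matches.
var value = doc.DocumentNode.SelectNodes("//td/input");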
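
If you already have the HTML as a string, as in the question, you can skip HtmlWeb and parse it directly. Below is a minimal sketch of a helper shaped like the Extract(htmlContent, xpath) call you asked for. The class name XPathExtractor and the method Extract are hypothetical names of mine, not part of the library, and returning the trimmed inner text of each node is an assumption about what "extract" should mean for you.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class XPathExtractor
{
    // Hypothetical helper mirroring the Extract(htmlContent, xpath) call from the question.
    // Assumption: the caller wants the trimmed inner text of every matching node.
    public static List<string> Extract(string htmlContent, string xpath)
    {
        // Parse the HTML string in-process; no browser or network access is involved.
        var doc = new HtmlDocument();
        doc.LoadHtml(htmlContent);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes.Select(node => node.InnerText.Trim()).ToList();
    }
}
```

You would call it as XPathExtractor.Extract(htmlContent, "//td"). If you need attribute values instead of text, node.GetAttributeValue(name, defaultValue) could be used in place of InnerText.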

Upvotes: 3
