Ben Ford

Reputation: 1394

Intelligent web scraping in C#

There are a number of products out there that provide a GUI for picking out the tags you want to scrape from a web page (WebHarvy, for example).

I've seen the HTML Agility Pack before for getting at the DOM. I just wanted to check whether anyone knows of any good libraries or processes for automatically finding the useful content within an HTML page and generating the XPath required.

I'm after something similar to how Evernote and iOS know where the "Article" is on a page, but ideally also working for repeating regions and pagination.

Upvotes: 0

Views: 901

Answers (1)

Remy

Reputation: 12713

Not sure if this is what you are looking for, but Diffbot is good at scraping content from websites:
http://www.diffbot.com/

Upvotes: 1
