FlatLander

Reputation: 1757

C# data scraping from websites

Hi, I am pretty new to C#. I have been working with PHP and JavaScript since the beginning of this year. I want to scrape posts and comments from a blog. The site is http://www.somewhereinblog.net

What I want to do is:

  1. Log in to the site from my own program
  2. Download the HTML
  3. Use regular expressions, XPath, or whatever comes in handy to separate the contents of posts and comments

I have been searching all over but understood very little, though I am quite sure I need to use 'htmlagilitypack'. I don't know how to add a library to a C# console or forms application. Can someone give me some help? I badly need this. I have only been using C# for a week, so I would be grateful for detailed information. Waiting eagerly.

Thanks in advance, brothers.

Upvotes: 1

Views: 2079

Answers (1)

Alberto León

Reputation: 2921

  1. Using WebClient you can log in and download the HTML.
  2. Instead of html-agility-pack I prefer CsQuery, because it lets you use jQuery-style selectors from C# code: you download the HTML into a string, then search and manipulate it much as you would a page with jQuery.
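
A rough sketch of point 1, assuming the site uses an ordinary form-based login that sets a session cookie. The class names `CookieAwareWebClient` and `BlogDownloader`, the form field names `username` and `password`, and the URLs you pass in are all placeholders of mine, not something taken from the site; check the real login form for the actual field names and action URL. WebClient does not remember cookies between requests on its own, so the sketch subclasses it to share one CookieContainer.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;
using System.Text;

// WebClient does not keep cookies between requests by default, so this
// subclass attaches one shared CookieContainer to every HTTP request.
public class CookieAwareWebClient : WebClient
{
    public CookieContainer Cookies { get; } = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request is HttpWebRequest http)
        {
            http.CookieContainer = Cookies;
        }
        return request;
    }
}

public static class BlogDownloader
{
    // Hypothetical field names: replace them with the name="..." values
    // of the inputs in the site's real login form.
    public static NameValueCollection BuildLoginForm(string user, string password)
    {
        return new NameValueCollection
        {
            { "username", user },
            { "password", password }
        };
    }

    // Posts the login form, then downloads a page with the same client
    // so the session cookie from the login is sent along.
    public static string LoginAndDownload(string loginUrl, string pageUrl,
                                          string user, string password)
    {
        using (var client = new CookieAwareWebClient())
        {
            client.Encoding = Encoding.UTF8;
            client.UploadValues(loginUrl, "POST", BuildLoginForm(user, password));
            return client.DownloadString(pageUrl);
        }
    }
}
```

The string returned by `LoginAndDownload` is what you would hand to the parser in point 2. With CsQuery (installed as a NuGet package) that should be roughly `CQ dom = CQ.Create(html);` followed by a selector lookup such as `dom["div.post"]`, where `div.post` is only a guess at the markup. On newer .NET versions WebClient is marked obsolete in favour of HttpClient, but it still works for a small tool like this.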

Upvotes: 5
