Ozzah

Reputation: 10711

C# Searching through HTML

Over the last few months I have written a few programs that load HTML pages into a string and do various things with them, such as extracting bits and pieces. I was basically writing my own GUI for some websites that have no API.

I've done this by stringing together many String.Substring(), String.IndexOf(), and String.LastIndexOf() calls.

I realise this is probably not the best way to do it - I was just writing a few "quick-and-dirty" trials to begin with.

What is the proper way to extract tokens from a web page? Thanks :)

Upvotes: 2

Views: 130

Answers (2)

Marc Gravell

Reputation: 1064234

For XHTML, load it into XmlDocument or XDocument.
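As a rough illustration of the XHTML case, the sketch below parses a well-formed page with XDocument and collects the href of every link. The class name `XhtmlLinks`, the method `ExtractHrefs`, and the choice of extracting links are assumptions made for the example. It also assumes the page declares the standard XHTML namespace and contains no undeclared entities such as `&nbsp;`, which would make the XML parser throw.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, named for this example only.
public static class XhtmlLinks
{
    // Elements of a page that declares xmlns="http://www.w3.org/1999/xhtml"
    // live in this namespace, so queries must qualify element names with it.
    private static readonly XNamespace Xhtml = "http://www.w3.org/1999/xhtml";

    public static string[] ExtractHrefs(string xhtml)
    {
        // Throws XmlException if the input is not well-formed XML.
        XDocument doc = XDocument.Parse(xhtml);

        return doc.Descendants(Xhtml + "a")
                  .Select(a => (string)a.Attribute("href")) // null when the attribute is missing
                  .Where(href => href != null)
                  .ToArray();
    }
}
```

Calling `XhtmlLinks.ExtractHrefs(pageText)` returns the link targets in document order. If the page does not declare the XHTML namespace, the query would need plain element names such as `"a"` instead.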

For (non-X)HTML, load it into the HTML Agility Pack's HtmlDocument - the API is almost the same as XmlDocument, so it should be familiar.
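A minimal sketch of the same idea with the HTML Agility Pack, assuming the HtmlAgilityPack NuGet package is referenced. The class name `HtmlLinks` and the method `ExtractHrefs` are made up for the example, and pulling out link targets is just one possible extraction task.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

// Hypothetical helper, named for this example only.
public static class HtmlLinks
{
    public static List<string> ExtractHrefs(string html)
    {
        // HtmlDocument tolerates markup that is not well-formed XML.
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var hrefs = new List<string>();

        // SelectNodes takes an XPath expression and returns null,
        // not an empty collection, when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return hrefs;
        }

        foreach (HtmlNode anchor in anchors)
        {
            hrefs.Add(anchor.GetAttributeValue("href", string.Empty));
        }
        return hrefs;
    }
}
```

The XPath query takes the place of chained IndexOf() and Substring() calls, so the extraction does not depend on the exact whitespace or attribute order of the markup.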

Upvotes: 3

Govind Malviya

Reputation: 13763

Use the HTML Agility Pack.

Upvotes: 3