Reputation: 2331
Before you stop reading and suggest HTML Agility (based on the title), I am already using this tool. The problem is this: I have a webpage that lists a whole bunch of case numbers and links to each individual case number page. My app already downloads this info and displays it in a DataGridView. However, my app also needs information from the individual case number pages (the links).
The problem is I already know it's going to take forever to acquire using HTML Agility. Getting the case page takes about 2 minutes. Code-wise, I'm feeding HTML Agility the HTML, adding the cell values to an array, and picking out the array indexes I need to display in my grid. That is a very large array to parse given the number of components on the page.
Any ideas on how to acquire the main page plus specific cells from the linked pages?
Upvotes: 0
Views: 95
Reputation: 11201
An example showing how you can use XPath in HtmlAgilityPack:

    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(yourHtml);

Example 1:

    // Get all divs whose class is "container"
    foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//div[@class='container']"))
    {
        Console.WriteLine(node.InnerText);
    }

Example 2:

    // Get the first div whose class is "container"
    HtmlNode first = doc.DocumentNode.SelectSingleNode("//div[@class='container']");
    Console.WriteLine(first.InnerText);

You can use XPath queries to get just the element(s) you want, instead of collecting every cell and indexing into an array.
For XPath syntax and more, see http://www.w3schools.com/xpath/xpath_syntax.asp
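Applied to the question, one possible approach is to download the linked case pages concurrently and run a narrow XPath query on each, so only the needed cells are pulled out. The following is a rough sketch, not a tested solution for your site: the class name `CaseScraper`, its method names, and any XPath you pass in (for example `//table[@id='caseDetails']//td[2]`) are hypothetical, and it assumes the pages can be fetched with a plain `HttpClient` GET.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

// Hypothetical helper; names are illustrative, not part of HtmlAgilityPack.
public static class CaseScraper
{
    // Reuse a single HttpClient for all requests.
    private static readonly HttpClient Http = new HttpClient();

    // Returns the trimmed text of every node matched by the XPath query.
    // SelectNodes returns null (not an empty collection) when nothing matches.
    public static List<string> ExtractCells(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return new List<string>();
        }
        return nodes.Select(n => n.InnerText.Trim()).ToList();
    }

    // Starts all downloads at once, then extracts the requested cells per page.
    // Duplicate URLs are dropped so each URL can serve as a dictionary key.
    public static async Task<Dictionary<string, List<string>>> FetchCasesAsync(
        IEnumerable<string> caseUrls, string cellXPath)
    {
        var tasks = caseUrls.Distinct().Select(async url =>
        {
            string html = await Http.GetStringAsync(url);
            return new KeyValuePair<string, List<string>>(
                url, ExtractCells(html, cellXPath));
        });

        KeyValuePair<string, List<string>>[] results = await Task.WhenAll(tasks);
        return results.ToDictionary(r => r.Key, r => r.Value);
    }
}
```

You would call `await CaseScraper.FetchCasesAsync(links, yourXPath)` with the links already scraped from the main page and bind the result to the grid. If the 2 minutes is mostly server response time rather than parsing, the concurrent downloads are likely where the gain comes from; with many links you may want to cap how many requests run at once.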
Upvotes: 1