Reputation: 11
I'm trying to pull a table off a website using the HTML Agility Pack, but I'm having trouble extracting the column data. Each row should have 6 columns. However, when I read the cells, all of the column data is merged into a single result.
I'm getting this: Vintage Buff Banner665c12425
Instead of this:
Vintage Buff Banner
665c
1
24
Blank
25
The code I'm using is below:
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load("http://www.tf2wh.com/backpack?bp=x44rUEmREP-OCT9Kp-9w6n3GOJQJpf43YQD_dp98AvY");
var xpath = "/html/body/div[@class='page']/div[@class='main']/div[@class='specialtrade']/table[@class='data']/tbody/tr[@class='normal']";
var rows = doc.DocumentNode.SelectNodes(xpath);
// SelectNodes returns null (not an empty collection) when nothing matches.
if (rows != null)
{
    foreach (HtmlNode row in rows)
    {
        HtmlNodeCollection cells = row.SelectNodes("th|td");
        if (cells == null) continue;
        foreach (HtmlNode cell in cells)
        {
            Console.WriteLine("cell: " + cell.InnerText);
        }
    }
}
Upvotes: 0
Views: 222
Reputation: 11
I figured it out: the page's HTML was malformed. I now run it through Tidy.NET before handing it to HTML Agility Pack, and I'm getting the results I want.
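For anyone who wants to confirm that bad markup is the cause before adding a tidy step, here is a rough diagnostic sketch. It only uses HTML Agility Pack's `ParseErrors` and `OuterHtml`; the class name `TableProbe`, its methods, and the inline sample markup (cells with no closing tags) are made up for illustration and are not taken from the actual page. What the parser reports, and how it nests the cells, may differ for the real site.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of HTML Agility Pack.
public static class TableProbe
{
    // Returns the trimmed text of each th/td cell, grouped by table row.
    public static List<List<string>> ReadCells(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<List<string>>();
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var rows = doc.DocumentNode.SelectNodes("//tr");
        if (rows == null) return result;

        foreach (HtmlNode row in rows)
        {
            var cells = row.SelectNodes("th|td");
            if (cells == null) continue;
            result.Add(cells.Select(c => c.InnerText.Trim()).ToList());
        }
        return result;
    }

    // Prints any problems the parser recorded, then each row's markup
    // as the parser actually built it, so nested cells are easy to spot.
    public static void Diagnose(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        foreach (HtmlParseError error in doc.ParseErrors)
        {
            Console.WriteLine("line " + error.Line + ": " + error.Reason);
        }

        var rows = doc.DocumentNode.SelectNodes("//tr");
        if (rows == null) return;
        foreach (HtmlNode row in rows)
        {
            Console.WriteLine(row.OuterHtml);
        }
    }

    public static void Main()
    {
        // Assumed sample: cells without closing tags, one way HTML can be "bad".
        string sample =
            "<table><tr class='normal'><td>Vintage Buff Banner<td>665c<td>1</tr></table>";

        Diagnose(sample);
        foreach (var row in ReadCells(sample))
        {
            Console.WriteLine(row.Count + " cell(s): " + string.Join(" | ", row));
        }
    }
}
```

If the row's `OuterHtml` shows cells sitting inside other cells, or `ParseErrors` is non-empty, cleaning the markup first (as I did with Tidy.NET) is a reasonable next step.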
Upvotes: 1