user3044218

Reputation: 11

HTML Agility Pack cells merged

I'm trying to pull a table off a website using the HTML Agility Pack, but I'm having trouble extracting the column data. Each row should have 6 columns. However, when I read the cells, all of the column data is merged into a single result.

I'm getting this: Vintage Buff Banner665c12425

Instead of this:

Vintage Buff Banner

665c

1

24

Blank

25

Code I'm using is below:

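    using System;
    using HtmlAgilityPack;

    HtmlWeb web = new HtmlWeb();
    HtmlDocument doc = web.Load("http://www.tf2wh.com/backpack?bp=x44rUEmREP-OCT9Kp-9w6n3GOJQJpf43YQD_dp98AvY");

    var xpath = "/html/body/div[@class='page']/div[@class='main']/div[@class='specialtrade']/table[@class='data']/tbody/tr[@class='normal']";

    // SelectNodes returns null (not an empty collection) when nothing matches.
    var rows = doc.DocumentNode.SelectNodes(xpath);
    if (rows == null)
    {
        Console.WriteLine("No rows matched the XPath.");
        return;
    }

    foreach (HtmlNode row in rows)
    {
        // Header and data cells that are direct children of this row.
        HtmlNodeCollection cells = row.SelectNodes("th|td");
        if (cells == null) continue;

        foreach (HtmlNode cell in cells)
        {
            Console.WriteLine("cell: " + cell.InnerText);
        }
    }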

Upvotes: 0

Views: 222

Answers (1)

user3044218

Reputation: 11

I figured it out: the page's HTML was malformed. Running it through Tidy.NET before handing it to the HTML Agility Pack gives me the results I want.
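For anyone who wants to try the same approach, here is a minimal sketch of that pipeline. It is not the exact code used above. It assumes the TidyNet package exposes `Tidy.Parse(Stream, Stream, TidyMessageCollection)` and an `Options.Xhtml` flag. The class name `TidyThenParse`, the helper `CleanHtml`, and the looser XPath are illustrative choices, and the XPath may need adjusting to whatever structure the tidied markup ends up with.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using HtmlAgilityPack;
using TidyNet; // assumption: the Tidy.NET (TidyNet) assembly is referenced

class TidyThenParse
{
    // Hypothetical helper: repairs malformed markup with Tidy.NET
    // and returns the cleaned-up document as a string.
    static string CleanHtml(string rawHtml)
    {
        var tidy = new Tidy();
        tidy.Options.Xhtml = true; // ask Tidy for well-formed XHTML output

        var messages = new TidyMessageCollection();
        using (var input = new MemoryStream(Encoding.UTF8.GetBytes(rawHtml)))
        using (var output = new MemoryStream())
        {
            tidy.Parse(input, output, messages);
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }

    static void Main()
    {
        string url = "http://www.tf2wh.com/backpack?bp=x44rUEmREP-OCT9Kp-9w6n3GOJQJpf43YQD_dp98AvY";

        // Download the raw page ourselves so it can be tidied before parsing.
        string rawHtml;
        using (var client = new WebClient())
        {
            rawHtml = client.DownloadString(url);
        }

        var doc = new HtmlDocument();
        doc.LoadHtml(CleanHtml(rawHtml));

        // Assumption: a relative XPath is used because the tidied markup
        // may not match the absolute path from the question.
        var rows = doc.DocumentNode.SelectNodes("//table[@class='data']//tr[@class='normal']");
        if (rows == null)
        {
            Console.WriteLine("No rows matched.");
            return;
        }

        foreach (HtmlNode row in rows)
        {
            HtmlNodeCollection cells = row.SelectNodes("th|td");
            if (cells == null) continue;

            foreach (HtmlNode cell in cells)
            {
                Console.WriteLine("cell: " + cell.InnerText.Trim());
            }
        }
    }
}
```

If the cells still come back merged after tidying, printing `cell.OuterHtml` instead of `cell.InnerText` should show whether the `td` elements are still nested inside one another.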

Upvotes: 1
