Reputation: 37
I'm following an RSS feed, which returns XML. Inside the XML are HTML tables, each returned as one long string. I'm trying to access the elements of such an HTML table with C#, so that I can use each of them as a variable in another program. An example of a table:
<table cellpadding="5"><tr><td><strong>Date (GMT)</strong></td><td><strong>Event</strong></td><td><strong>Cons.</strong></td><td><strong>Actual</strong></td><td><strong>Previous</strong></td></tr><tr><td>Jun 7 11:00</td><td>Announcement</td><td>6.250 %</td><td>6.310 %</td><td>6.560 %</td></tr></table>
Just about every similar thread on here has suggested HtmlAgilityPack, which I'm trying to use. So far, I've been able to pull out the HTML table and store it in a string variable, but I can't seem to pull out the individual table elements. The following is my hack, based on several users' suggestions:
XmlDocument xDoc = new XmlDocument();
xDoc.Load("http://rssfeed.com");
string descr = xDoc.SelectSingleNode("rss/channel/item/description").InnerText;
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml("descr");
// A Print statement here (textBox1.Text = descr;) shows that I'm successfully accessing the HTML table
var table = doc.DocumentNode.Descendants("tr")
    .Select(n => n.Elements("td").Select(o => o.InnerText).ToArray());
foreach (var tr in table)
{
    textBox1.Text = String.Format("{0} {1} {2}", tr[0], tr[1], tr[2]);
}
Any and all suggestions are extremely welcome.
Thanks, D
Upvotes: 3
Views: 5950
Reputation: 11945
This worked for me, and it will work for you as long as the HTML is well-formed XML and the values are always inside a td. The Value of a td that contains a single element (here, the strong elements in the header row) is the same as that element's value.
XElement table = XElement.Parse("<table cellpadding=\"5\"><tr><td><strong>Date (GMT)</strong></td><td><strong>Event</strong></td><td><strong>Cons.</strong></td><td><strong>Actual</strong></td><td><strong>Previous</strong></td></tr><tr><td>Jun 7 11:00</td><td>Announcement</td><td>6.250 %</td><td>6.310 %</td><td>6.560 %</td></tr></table>");
string[] values = table.Descendants("td").Select(td => td.Value).ToArray();
Or, to get the rows as arrays of values:
var rows = table.Elements()
    .Select(tr => tr.Elements().Select(td => td.Value).ToArray())
    .ToList();
Update: to print the values from either result:
foreach (string value in values)
    Console.WriteLine(value);
foreach (string[] row in rows)
    foreach (string value in row)
        Console.WriteLine(value);
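Since the goal is to use the cells as variables elsewhere, one option is to pair each data row with the header row so that values can be looked up by column name. The following is only a sketch under the same assumption as above (the table parses as well-formed XML). The names TableRecords, Sample and Parse are made up for illustration and are not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, named for illustration only.
public static class TableRecords
{
    // The sample table from the question.
    public const string Sample =
        "<table cellpadding=\"5\"><tr><td><strong>Date (GMT)</strong></td><td><strong>Event</strong></td>" +
        "<td><strong>Cons.</strong></td><td><strong>Actual</strong></td><td><strong>Previous</strong></td></tr>" +
        "<tr><td>Jun 7 11:00</td><td>Announcement</td><td>6.250 %</td><td>6.310 %</td><td>6.560 %</td></tr></table>";

    // Treats the first row as headers and returns one dictionary per remaining row.
    // Assumes the markup is well-formed XML and that header names are unique.
    public static List<Dictionary<string, string>> Parse(string html)
    {
        XElement table = XElement.Parse(html);

        List<string[]> rows = table.Elements("tr")
            .Select(tr => tr.Elements("td").Select(td => td.Value).ToArray())
            .ToList();

        string[] headers = rows[0];

        return rows.Skip(1)
            .Select(row => headers
                .Zip(row, (h, v) => new KeyValuePair<string, string>(h, v))
                .ToDictionary(p => p.Key, p => p.Value))
            .ToList();
    }

    public static void Main()
    {
        foreach (Dictionary<string, string> record in Parse(Sample))
        {
            // Look up cells by column name instead of by index.
            Console.WriteLine("{0}: actual {1}, previous {2}",
                record["Event"], record["Actual"], record["Previous"]);
        }
    }
}
```

If you prefer to stay with HtmlAgilityPack as in the question, note that doc.LoadHtml("descr") parses the literal text descr rather than the contents of the variable; doc.LoadHtml(descr) is presumably what was intended. Also, assigning textBox1.Text inside the loop overwrites it on every pass, so only the last row would remain visible.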
Upvotes: 2