Reputation: 31
I want to extract data from an HTML table row by row, but I'm having trouble keeping the columns of each row together. The code below writes each cell on its own line, whereas I want each row on one line, followed by the next row on the next line. How can I do that?
HtmlNode table = doc.DocumentNode.SelectSingleNode("//table[" + tableCounter + "]");
foreach (var cell in table.SelectNodes(".//tr/td"))
{
    string someVariable = cell.InnerText;
    ReportFileWriter(someVariable);
}
tableCounter++;
The output I get from this code has one cell per line, whereas the original table has several columns in each row. The output I want has each row on a single line, with spaces between the columns.
Upvotes: 2
Views: 2297
Reputation: 5986
Since I don't know your specific website, I used the following code to parse the HTML table on an example page.
You need to install the HtmlAgilityPack NuGet package first. Code:
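using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

// Download the page and load it into an HtmlAgilityPack document.
WebClient webClient = new WebClient();
string page = webClient.DownloadString("http://www.mufap.com.pk/payout-report.php?tab=01");
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(page);

// Build one list of trimmed cell texts per row.
List<List<string>> table = doc.DocumentNode.SelectSingleNode("//table[@class='mydata']")
    .Descendants("tr")
    .Skip(1)                                        // skip the first row
    .Where(tr => tr.Elements("td").Count() > 1)     // keep rows with more than one cell
    .Select(tr => tr.Elements("td").Select(td => td.InnerText.Trim()).ToList())
    .ToList();

// Join the cells of the first row with spaces and print them on one line.
string result = string.Empty;
foreach (var item in table[0])
{
    result = result + " " + item;
}
Console.WriteLine(result);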
This prints the first data row of the table on that website, with its cells separated by spaces.
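To get every row on its own line rather than only the first one, the same idea can be applied per `tr`: collect the `td` texts of each row and join them with a space. The following is only a minimal sketch under a few assumptions: the class name `TableRows`, the method `ExtractRows`, and its `tableXPath` parameter are hypothetical names introduced here for illustration, and rows without any `td` cells (such as header rows built from `th`) are simply skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class TableRows
{
    // Hypothetical helper: returns one string per table row,
    // with the row's cell texts joined by a single space.
    public static List<string> ExtractRows(string html, string tableXPath = "//table")
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when nothing matches the XPath.
        HtmlNode table = doc.DocumentNode.SelectSingleNode(tableXPath);
        if (table == null)
            return new List<string>();

        return table.Descendants("tr")
            // Collect the trimmed text of each <td> that is a direct child of the row.
            .Select(tr => tr.Elements("td").Select(td => td.InnerText.Trim()).ToList())
            // Skip rows that contain no <td> cells at all.
            .Where(cells => cells.Count > 0)
            // One output line per row, columns separated by a space.
            .Select(cells => string.Join(" ", cells))
            .ToList();
    }
}
```

With the code from the question, each returned line could then be passed to `ReportFileWriter` inside a `foreach` loop, so that one call writes one complete row. Iterating over rows first and cells second is what keeps the columns of a row together; selecting `.//tr/td` directly flattens all cells into a single sequence.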
Upvotes: 2