Reputation: 3314
I am getting an object reference error when parsing an external HTML file. I think this is because not all of the selected elements have a class attribute. Here is my code:
foreach (HtmlNode link in doc.DocumentNode.Descendants("li").Where(i => i.Attributes["class"].Value == "name"))
{
    string result = link.InnerText.Trim().Replace(" ", "");
    Console.WriteLine(result);
}
How do I select only the elements that have the class name "name"?
Here is the HTML I'm trying to parse:
<li>
<span class="name">
<a href="/players/joe-bloggs.html">Joe, Bloggs</a>
</span>
<span class="country">
<img src="/img/flags/15x15/USA.gif" alt="USA"/>
United States
</span>
</li>
<li>
<span class="name">
<a href="/players/joe-bloggs.html">Joe, Bloggs</a>
</span>
<span class="country">
<img src="/img/flags/15x15/USA.gif" alt="USA"/>
United States
</span>
</li>
<li>
<span class="name">
<a href="/players/joe-bloggs.html">Joe, Bloggs</a>
</span>
<span class="country">
<img src="/img/flags/15x15/RSA.gif" alt="RSA"/>
South Africa
</span>
</li>
Upvotes: 2
Views: 1206
Reputation: 236208
You should select the <a> elements instead of the <li> elements. Note also that it is the <span> elements that have the class attribute, not the <li> elements. I suggest using XPath predicates:
var links = doc.DocumentNode.SelectNodes("//li/span[@class='name']/a");
This XPath selects all <span> elements that are children of an <li> element and have a class attribute equal to name, and then selects the <a> element inside each of them.
foreach (var a in links)
    Console.WriteLine(a.InnerText);
For your sample HTML, the output is:
Joe, Bloggs
Joe, Bloggs
Joe, Bloggs
Side note: you can use HttpUtility.HtmlDecode(a.InnerText) to get the decoded text. This decodes all HTML entities, not just the one your Replace call handles.
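If you would rather stay with LINQ, as in the question, here is a minimal sketch of a null-safe version. As far as I can tell, the original error comes from i.Attributes["class"] returning null for the <li> elements, which have no class attribute, so .Value throws. GetAttributeValue takes a default value and avoids that. NameExtractor and ExtractNames are hypothetical names used only for illustration, and HtmlEntity.DeEntitize is used here in place of HttpUtility.HtmlDecode so the sketch only depends on HtmlAgilityPack.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class for illustration; not part of HtmlAgilityPack.
public static class NameExtractor
{
    public static List<string> ExtractNames(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .Descendants("span")
            // GetAttributeValue returns the supplied default ("") when the
            // attribute is missing, so there is no null to dereference.
            .Where(s => s.GetAttributeValue("class", "") == "name")
            // Take the first <a> child of each matching span, if any.
            .Select(s => s.Element("a"))
            .Where(a => a != null)
            // Decode HTML entities and trim surrounding whitespace.
            .Select(a => HtmlEntity.DeEntitize(a.InnerText).Trim())
            .ToList();
    }
}
```

Calling NameExtractor.ExtractNames(html) on the sample HTML should give the same three names as the XPath version. Spans without a class attribute, or without an <a> child, are simply skipped.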
UPDATE: Parsing players
var players = from p in doc.DocumentNode.SelectNodes("//li")
              let name = p.SelectSingleNode("span[@class='name']/a")
              let country = p.SelectSingleNode("span[@class='country']")
              select new
              {
                  Name = (name == null) ? null :
                      HttpUtility.HtmlDecode(name.InnerText.Trim()),
                  Country = (country == null) ? null :
                      HttpUtility.HtmlDecode(country.InnerText.Trim())
              };
Result:
[
{
Name: "Joe, Bloggs",
Country: "United States"
},
{
Name: "Joe, Bloggs",
Country: "United States"
},
{
Name: "Joe, Bloggs",
Country: "South Africa"
}
]
Upvotes: 3