Dumbo

Reputation: 14132

Searching in HTML file using C# where many similar tags exist

Imagine the following fragment of an HTML file:

<div class='span1 league'>
    <div class='league-gold-1 leagues size-64'></div>
</div>
<div class='span4 stats'>
    <div class='points'>
        <span class="gold">491</span>
        points
        (<span class="gold">391</span> away for region #1)
    </div>
    <div class='games'>
        Won <span class="text-success">37</span>,
        lost <span class="text-error">51</span>,
        ratio <span>42.05</span>%
    </div>
    <div class='race'>
        Favorite Race:
        <div class='race-terran races size-16'></div>
        <span>Terran</span>
    </div>
</div>

Say I need to get the number of won and lost games, which are 37 and 51 in this case, and also the points (491 in this case). I've been trying with HtmlAgilityPack, but with no success so far. If you know a way to do this, please let me know!

Upvotes: 0

Views: 154

Answers (2)

VladL

Reputation: 13043

As a workaround, you could try a regular expression:

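 Match m = Regex.Match(htmlstring, "<span class=\"text-success\">([0-9]+)</span>.*?<span class=\"text-error\">([0-9]+)</span>", RegexOptions.Singleline);
 if (m.Success) // m.Result(...) throws on a failed match, so check first
 {
     string won = m.Groups[1].Value;   // "37" for the sample above
     string loss = m.Groups[2].Value;  // "51" for the sample above
 }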
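
The same idea can be extended to pick up the points as well. The following is only a sketch: the class name GameStatsRegex and the method TryParse are made up for illustration, and the pattern assumes the markup looks exactly like the sample in the question (double-quoted class attributes, the word "points" right after the first gold span). Any change in the page layout will make it stop matching.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class GameStatsRegex
{
    // Named groups make it clear which capture is which.
    // Assumes the exact attribute quoting used in the sample HTML.
    private static readonly Regex Pattern = new Regex(
        "<span class=\"gold\">(?<points>[0-9]+)</span>\\s*points" +
        ".*?<span class=\"text-success\">(?<won>[0-9]+)</span>" +
        ".*?<span class=\"text-error\">(?<lost>[0-9]+)</span>",
        RegexOptions.Singleline);

    // Returns false when the markup does not match the assumed layout.
    public static bool TryParse(string html, out int points, out int won, out int lost)
    {
        points = won = lost = 0;
        Match m = Pattern.Match(html);
        if (!m.Success) return false;

        return int.TryParse(m.Groups["points"].Value, out points)
            && int.TryParse(m.Groups["won"].Value, out won)
            && int.TryParse(m.Groups["lost"].Value, out lost);
    }
}
```

Usage would be something like `GameStatsRegex.TryParse(htmlstring, out int points, out int won, out int lost)`, checking the returned bool before using the values.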

Upvotes: 0

I4V

Reputation: 35373

Using HtmlAgilityPack:

var doc = new HtmlAgilityPack.HtmlDocument();
doc.Load(fname);
var won  = doc.DocumentNode.SelectSingleNode("//div[@class='games']/*[@class='text-success']").InnerText;
var lost = doc.DocumentNode.SelectSingleNode("//div[@class='games']/*[@class='text-error']").InnerText;
var points = doc.DocumentNode.SelectSingleNode("//div[@class='points']/*[@class='gold']").InnerText;

You can also use LINQ instead of XPath:

// Compare the class attribute specifically, not just any attribute value
var won = doc.DocumentNode.Descendants("span")
          .First(s => s.GetAttributeValue("class", "") == "text-success")
          .InnerText;
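
Note that SelectSingleNode returns null when nothing matches, and First throws on an empty sequence, so both snippets fail with an exception if the page layout differs. A more defensive variant might look like the sketch below. The names GameStatsHap, TextOrNull and TryParse are hypothetical, it assumes the HtmlAgilityPack NuGet package is referenced, and it loads the HTML from a string with LoadHtml rather than from a file.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical helper built on the XPath queries from this answer.
public static class GameStatsHap
{
    // Returns the trimmed text of the first node matching xpath, or null if none.
    private static string TextOrNull(HtmlDocument doc, string xpath)
    {
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText.Trim();
    }

    // Returns false if any of the three values is missing or not an integer.
    public static bool TryParse(string html, out int points, out int won, out int lost)
    {
        points = won = lost = 0;
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of a file

        // int.TryParse returns false for null, so a missing node is handled too.
        return int.TryParse(TextOrNull(doc, "//div[@class='points']/span[@class='gold']"), out points)
            && int.TryParse(TextOrNull(doc, "//div[@class='games']/span[@class='text-success']"), out won)
            && int.TryParse(TextOrNull(doc, "//div[@class='games']/span[@class='text-error']"), out lost);
    }
}
```

The points query relies on SelectSingleNode returning the first match in document order, which in the sample is the 491 span rather than the 391 one.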

Upvotes: 1
