Charlie Morrison

Reputation: 41

Speed up parsing in Html Agility Pack

This is a method I use to grab certain tags with the Html Agility Pack. I use it to track rankings in Google Local. It seems to take quite a bit of time and to be memory intensive. Does anyone have any suggestions to make it better?

    private void findGoogleLocal(HtmlNode node) {

        // Record this node if its id contains "panel_" but is not exactly "panel__".
        HtmlAttribute id = node.Attributes["id"];
        if (id != null && id.Value.Contains("panel_") && id.Value != "panel__") {
            GoogleLocalResults.Add(new Result(URLGoogleLocal, Listing, node, SearchEngine.Google, SearchType.Local, ResultType.GooglePlaces));
        }

        // Recurse into every child node.
        if (node.HasChildNodes) {
            foreach (HtmlNode child in node.ChildNodes) {
                findGoogleLocal(child);
            }
        }
    }

Upvotes: 0

Views: 1598

Answers (3)

Skiminok

Reputation: 2871

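I just want to add another clean, simple and fast solution: XPath. Note the leading dot in the expression, which keeps the search relative to node (a bare // searches the whole document), and that SelectNodes returns null by default when nothing matches.

var matches = node.SelectNodes(".//*[contains(@id, 'panel_') and @id != 'panel__']");
if (matches != null)   // null, not an empty collection, when there are no matches
{
    foreach (var match in matches)
        GoogleLocalResults.Add(new Result(URLGoogleLocal, Listing, match, SearchEngine.Google, SearchType.Local, ResultType.GooglePlaces));
}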

Upvotes: 2

BrokenGlass

Reputation: 160922

Why does this method have to be recursive? Just get all the nodes in one go (example using the LINQ support in HAP):

// Descendants() does not include node itself; use DescendantsAndSelf() if it should be checked too.
var results = node.Descendants()
                  .Where(x => x.Attributes["id"] != null &&
                              x.Attributes["id"].Value.Contains("panel_") &&
                              x.Attributes["id"].Value != "panel__")
                  .Select(x => new Result(URLGoogleLocal, Listing, x, SearchEngine.Google, SearchType.Local, ResultType.GooglePlaces));

Upvotes: 2

Jamie Treworgy

Reputation: 24344

Fizzler: CSS selector engine for HAP

http://code.google.com/p/fizzler/
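
For illustration, here is a rough sketch of how the same filter might look with Fizzler. It assumes the Fizzler.Systems.HtmlAgilityPack package is referenced, which adds a QuerySelectorAll extension method to HtmlNode. PanelFinder and FindPanels are hypothetical names made up for this example, and the "panel__" exclusion is done in LINQ rather than in the selector to avoid relying on :not() support.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack; // supplies QuerySelectorAll on HtmlNode

// Hypothetical helper, not part of HAP or Fizzler.
public static class PanelFinder
{
    // Returns the nodes under root whose id contains "panel_",
    // leaving out any node whose id is exactly "panel__".
    public static List<HtmlNode> FindPanels(HtmlNode root)
    {
        return root
            .QuerySelectorAll("[id*='panel_']")                       // CSS3 substring match on id
            .Where(n => n.GetAttributeValue("id", "") != "panel__")   // same exclusion as the question
            .ToList();
    }
}
```

Each node returned by FindPanels could then be wrapped in a Result and added to GoogleLocalResults, as in the original method. Whether this is faster than the XPath or LINQ versions would need to be measured on real pages.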

Upvotes: 0
