Shimmy Weitzhandler

Reputation: 104692

Html Agility Pack returns invalid XPath

I have an HTML document opened in two windows, and I need the selected node to be synchronized between both windows.

Using Html Agility Pack I tried:

HtmlNode myNode = GetSomeCertainNode();

string xpath = myNode.XPath; //xpath = "/#comment[1]"

// This line throws an XPathException
var reExtract = myNode.OwnerDocument.DocumentNode.SelectSingleNode(xpath);

Exception message: '/#comment[1]' has an invalid token.

I took the XPath from the node itself, so I expected it to be a valid expression, and I am evaluating it against the same document. Why does it fail? What am I missing?

Update

When I select certain other nodes I get a different exception: Expression must evaluate to a node-set. (In that case xpath is /html[1]/body[1]/div[1]/p[3]/strong[1]/#text[1].)

Again, the value comes from the node itself, so it is strange that the library rejects it as invalid.

Upvotes: 2

Views: 1519

Answers (2)

Shimmy Weitzhandler

Reputation: 104692

Based on Max Toro's answer, I created a workaround function:

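// Rewrites a trailing Html Agility Pack pseudo-step such as "/#text[1]"
// or "/#comment[1]" into valid XPath ("/text()[1]", "/comment()[1]").
private string ValidateXPath(string xpath)
{
  var index = xpath.LastIndexOf('/');
  if (index < 0)
    return xpath; // no step separator, nothing to rewrite

  var lastPath = xpath.Substring(index);

  if (lastPath.Contains("#"))
  {
    xpath = xpath.Substring(0, index);
    lastPath = lastPath.Replace("#", "");     // "/#text[1]" -> "/text[1]"
    lastPath = lastPath.Replace("[", "()[");  // "/text[1]"  -> "/text()[1]"
    xpath = xpath + lastPath;
  }

  return xpath;
}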

Now it works great.

Upvotes: 1

Max Toro

Reputation: 28608

The # character is illegal in an element name, so the XPath parser cannot accept #comment[1]. A valid XPath expression that selects a comment would be /comment()[1].
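The same reasoning applies to the #text step from the question's update, since XPath selects text nodes with text(). As a minimal sketch of that substitution, assuming the only pseudo-names that appear in the path are #text and #comment (the class name XPathFixer and the method name ToStandardXPath are hypothetical, not part of Html Agility Pack):

```csharp
using System;
using System.Text.RegularExpressions;

static class XPathFixer
{
    // Matches a step such as "/#text" or "/#comment" when it is followed by
    // a predicate, another step, or the end of the path.
    private static readonly Regex PseudoStep =
        new Regex(@"/#(text|comment)(?=\[|/|$)");

    // Hypothetical helper: turns "#text" / "#comment" steps into the
    // XPath node tests "text()" / "comment()". Other steps are left as-is.
    public static string ToStandardXPath(string xpath)
    {
        if (string.IsNullOrEmpty(xpath))
            return xpath;

        return PseudoStep.Replace(xpath, "/$1()");
    }
}
```

With this sketch, the string /#comment[1] becomes /comment()[1], and a path ending in /strong[1]/#text[1] ends in /strong[1]/text()[1]. The result could then be passed to SelectSingleNode as in the question.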

Upvotes: 2
