Acquiring node in nodejs + xpath

Question

I have an element on a webpage for which Chrome Inspector gives the following XPath:

//*[@id="page-wrapper"]/div/table/tbody/tr/td/table/tbody/tr/td[2]/table/tbody/tr[3]/td/table[2]/tbody/tr[2]/td[2]/a

I want to get this node programmatically in Node.js.

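// Assumed requires; the original snippet did not show them.
var parse5 = require('parse5');
var xmlserializer = require('xmlserializer');
var xmldomparser = require('xmldom').DOMParser;
var xpath = require('xpath');

var parser = new parse5.Parser();
var document = parser.parse(data);
var xhtmldoc = xmlserializer.serializeToString(document);
var xdom = new xmldomparser().parseFromString(xhtmldoc);
var selector = xpath.useNamespaces({"doc": "http://www.w3.org/1999/xhtml"});
var node = selector('//*[@id="page-wrapper"]/div/table/tbody/tr/td/table/tbody/tr/td[2]/table/tbody/tr[3]/td/table[2]/tbody/tr[2]/td[2]/a', xdom);
console.log(node);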

But it consistently returns an empty object, whatever variation of the XPath I try. Is it possible to achieve this?

Thanks.

Mathias Müller · Accepted Answer

It seems that you are declaring the correct namespace and a prefix:

 var selector = xpath.useNamespaces({"doc": "http://www.w3.org/1999/xhtml"});

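but then you do not use it in the path expression. In XPath 1.0, an unprefixed element name only matches elements that are in no namespace, so steps like div or table never match the XHTML elements. Prefix each element name with doc: in your path expression: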

var node = selector('//*[@id="page-wrapper"]/doc:div/doc:table/doc:tbody/doc:tr/doc:td/doc:table/doc:tbody/doc:tr/doc:td[2]/doc:table/doc:tbody/doc:tr[3]/doc:td/doc:table[2]/doc:tbody/doc:tr[2]/doc:td[2]/doc:a', xdom);
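
Prefixing a long path copied from Chrome by hand is easy to get wrong. As a minimal sketch, a small helper could add the prefix to each plain element step. The name prefixSteps is hypothetical and not part of the xpath package, and it only handles simple location paths like the one above (no unions, and nothing inside predicates is rewritten):

```javascript
// Hypothetical helper: adds a namespace prefix to every plain element step
// of a simple XPath such as the ones Chrome's inspector produces.
// It leaves "*", attribute steps, node tests like text(), names that already
// carry a prefix or axis, and everything inside [...] predicates untouched.
function prefixSteps(expression, prefix) {
  var out = '';
  var step = '';     // characters of the step currently being collected
  var depth = 0;     // nesting level of [...] predicates
  var quote = null;  // active string delimiter inside a predicate, if any

  function flush() {
    // A step qualifies if it starts with a bare name that is not followed
    // by ":" (prefix or axis) or "(" (node test or function call).
    var m = /^[A-Za-z_][\w.-]*/.exec(step);
    if (m) {
      var next = step.charAt(m[0].length);
      if (next !== ':' && next !== '(') {
        step = prefix + ':' + step;
      }
    }
    out += step;
    step = '';
  }

  for (var i = 0; i < expression.length; i++) {
    var c = expression.charAt(i);
    if (quote) {
      if (c === quote) quote = null;        // end of a string literal
    } else if (c === '"' || c === "'") {
      quote = c;                            // start of a string literal
    } else if (c === '[') {
      depth++;
    } else if (c === ']') {
      depth--;
    } else if (c === '/' && depth === 0) {
      flush();                              // step boundary outside predicates
      out += '/';
      continue;
    }
    step += c;
  }
  flush();
  return out;
}

// Usage: prefix a shortened version of the path from the question.
console.log(prefixSteps('//*[@id="page-wrapper"]/div/table/tbody/tr/td[2]/a', 'doc'));
```

The result can then be passed to the selector created with xpath.useNamespaces, using the same prefix that was declared there.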

That said, the XPath expression you got from Chrome Inspector is brittle, because it relies only on the positions of nodes. If you explain what you are trying to find in that document (and show the document, of course), people could suggest an alternative expression.
