Gaurav Joseph

Reputation: 946

Acquiring node in nodejs + xpath

I have an element on a webpage for which Chrome Inspector gives the following XPath: //*[@id="page-wrapper"]/div/table/tbody/tr/td/table/tbody/tr/td[2]/table/tbody/tr[3]/td/table[2]/tbody/tr[2]/td[2]/a

I want to get this node programmatically in Node.js.

var parser = new parse5.Parser();
var document = parser.parse(data);
var xhtmldoc = xmlserializer.serializeToString(document);
var xdom = new xmldomparser().parseFromString(xhtmldoc);
var selector = xpath.useNamespaces({"doc": "http://www.w3.org/1999/xhtml"});
var node = selector('//*[@id="page-wrapper"]/div/table/tbody/tr/td/table/tbody/tr/td[2]/table/tbody/tr[3]/td/table[2]/tbody/tr[2]/td[2]/a', xdom);
console.log(node);

But it consistently returns an empty object, whatever variation of the XPath expression I try. Is it possible to achieve this?

Thanks.

Upvotes: 0

Views: 486

Answers (1)

Mathias Müller

Reputation: 22647

It seems to me that you are declaring the correct namespace and a prefix:

var selector = xpath.useNamespaces({"doc": "http://www.w3.org/1999/xhtml"});

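but then you do not use it in the path expression. In XPath 1.0, an element name without a prefix only matches elements in no namespace, so the XHTML elements are never found. Prefix the element names with doc: in your path expression: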

var node = selector('//*[@id="page-wrapper"]/doc:div/doc:table/doc:tbody/doc:tr/doc:td/doc:table/doc:tbody/doc:tr/doc:td[2]/doc:table/doc:tbody/doc:tr[3]/doc:td/doc:table[2]/doc:tbody/doc:tr[2]/doc:td[2]/doc:a', xdom);
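Adding the prefix by hand to a long path is error-prone, so here is a minimal sketch of a helper that does it for simple paths like the one above. The name prefixSteps is hypothetical and not part of the xpath package, and the helper assumes that no predicate contains a "/" (true for positional predicates such as td[2]).

```javascript
// Hypothetical helper, not part of the xpath package.
// Adds `prefix:` to every plain element step of a simple location path.
// Assumption: predicates contain no "/" (true for steps like "td[2]").
function prefixSteps(path, prefix) {
  // A step to prefix: a bare element name, optionally followed by a predicate.
  var plainStep = /^[A-Za-z_][\w.-]*(\[.*\])?$/;
  return path
    .split('/')
    .map(function (step) {
      // Empty steps (from "//"), "*", "@attr", ".", "..", "text()", axes and
      // already-prefixed names do not match and are left unchanged.
      return plainStep.test(step) ? prefix + ':' + step : step;
    })
    .join('/');
}

console.log(prefixSteps('//*[@id="page-wrapper"]/div/table[2]/tbody/tr/td[2]/a', 'doc'));
```

The returned string can then be passed to the selector created with xpath.useNamespaces, in place of the hand-written expression.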

That said, the XPath expression you got back from Chrome Inspector is not very robust, because it relies only on the positions of nodes. If you explain what you are trying to find in that document (and show the document, of course), people could suggest an alternative expression.

Upvotes: 1
