Bob
Bob

Reputation: 6173

How to parse Xpath expressions in Python?

I need to parse (not to evaluate) Xpath expressions in Python to change them, e.g. I have expressions like

//div[...whatever...]//some-other-node...

and I need to change them to (for example):

/changed-node[@attr='value' and ...whatever...]/another-changed-node[@attr='value' ...

As it seems to me I need to split the original expression to steps and the steps to axes+nodes and predicates. Is there some tool I can do it with or is there a nice and easy way to do it without one?

The catch is I can't be sure that the predicates of original expressions won't contain something like [@id='some/value/with/slashes'] so I can't parse them with naive regexes.

Upvotes: 2

Views: 345

Answers (1)

Michael Kay
Michael Kay

Reputation: 163262

You might be able to use the REx parser generator from Gunther Rademacher. See http://www.bottlecaps.de/rex/ This will generate a parser for any grammar from a suitable BNF, and suitable BNF for various XPath versions is available. REx is a superb piece of technology spoilt only by extremely poor documentation. It can generate parsers in a number of languages including Javascript, XQuery, and XSLT. It's used in the Saxon-JS product to parse dynamic XPath expressions within the browser.

Another approach is to use the XQuery to XQueryX converters available from W3C (XPath is a subset of XQuery so these will handle XPath as well. These produce a representation of the syntax tree in XML).

Upvotes: 3

Related Questions