satwell

Reputation: 274

pyparsing transform_string with negative lookahead

I'm trying to implement simple shell-style string variable interpolation with $varname syntax, using pyparsing. For example, if I have a variable foo with value "bar", then transform "x $foo x" to "x bar x".

But I'd like to prevent variable interpolation when the $ has a backslash in front, so "x \$foo x" should stay "x \$foo x". I'm using pyparsing's transform_string, and I tried adding a negative lookahead so the expression won't match text that starts with \, but it's not working.

import pyparsing as pp

vars = {"foo": "bar"}
interp = pp.Combine(
    ~pp.Literal("\\")
    + pp.Literal("$").suppress()
    + pp.common.identifier.set_parse_action(lambda t: vars[t[0]])
)
print(interp.transform_string("x $foo x"))
print(interp.transform_string("x \\$foo x"))

This outputs:

x bar x
x \bar x

But I'd like it to output:

x bar x
x \$foo x

I suspect negative lookahead at the beginning of a parser doesn't work with transform_string because it can still find a substring without the \ that parses. But I'm not sure how to fix this.
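To make that suspicion concrete, here is a toy model of a scanner that retries a matcher at every offset. It is only a sketch of the idea and not pyparsing's actual implementation; the names scan and lookahead_matcher are made up for illustration. The lookahead rejects the offset where the backslash sits, but the scanner then moves on to the $ itself, where the lookahead passes.

```python
# Toy model of a left-to-right scanner. NOT pyparsing's real code;
# `scan` and `lookahead_matcher` are hypothetical helpers.

def scan(text, matcher):
    """Try `matcher` at every offset and collect (start, end) spans."""
    spans = []
    pos = 0
    while pos < len(text):
        end = matcher(text, pos)
        if end is None:
            pos += 1  # no match here, slide one character forward
        else:
            spans.append((pos, end))
            pos = end
    return spans


def lookahead_matcher(text, pos):
    """Mimics: not-a-backslash lookahead, then "$", then an identifier."""
    if text.startswith("\\", pos):  # lookahead inspects the char AT pos
        return None
    if not text.startswith("$", pos):
        return None
    end = pos + 1
    # First identifier character must be a letter or underscore.
    if end >= len(text) or not (text[end].isalpha() or text[end] == "_"):
        return None
    while end < len(text) and (text[end].isalnum() or text[end] == "_"):
        end += 1
    return end


text = "x \\$foo x"  # the characters: x \$foo x
spans = scan(text, lookahead_matcher)
print(spans)  # [(3, 7)]
print(text[spans[0][0]:spans[0][1]])  # $foo
```

The match starts at offset 3, the $, one character after the backslash, so the lookahead never gets a chance to veto it. What is needed is a check on the character before the match position.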

Upvotes: 1

Views: 59

Answers (2)

PaulMcG

Reputation: 63762

Glad you resolved your issue. To answer your original question, I was able to fix it by changing your expression to this (essentially a negative lookbehind):

interp = pp.Combine(
    ~pp.PrecededBy("\\")
    + pp.Literal("$").suppress()
    + pp.common.identifier.set_parse_action(lambda t: vars[t[0]])
)
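For completeness, here is the same expression as a self-contained script, assuming pyparsing 3.x. The names variables and expand are illustrative and not from the question, and the .copy() call is an added precaution so the parse action is not attached to the shared pp.common.identifier object.

```python
import pyparsing as pp

# Hypothetical variable table for the demo.
variables = {"foo": "bar"}

# Copy the shared identifier expression before attaching a parse action,
# so pp.common.identifier itself is left unmodified.
identifier = pp.common.identifier.copy().set_parse_action(
    lambda t: variables[t[0]]
)

interp = pp.Combine(
    ~pp.PrecededBy("\\")  # fail if the previous character is a backslash
    + pp.Literal("$").suppress()
    + identifier
)


def expand(text):
    """Replace $name with its value unless the $ follows a backslash."""
    return interp.transform_string(text)


print(expand("x $foo x"))
print(expand("x \\$foo x"))
```

With this version the escaped form should pass through unchanged, backslash included. One limitation to be aware of: a single-character lookbehind cannot tell an escaping backslash from an escaped one, so input such as \\$foo (an escaped backslash followed by a variable) would also be left alone.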

Upvotes: 1

satwell

Reputation: 274

After looking at this again, I realized a couple of things:

  1. \$ actually needs to be replaced with $ in the output string to get standard string escaping behavior.
  2. Once I modify the parser to consume the \$ and replace it with a $, everything works properly.

I've updated the code to:

import pyparsing as pp

vars = {"foo": "bar"}
interp = pp.Combine(pp.Literal("\\").suppress() + pp.Char(pp.printables)) | pp.Combine(
    pp.Literal("$").suppress()
    + pp.common.identifier.set_parse_action(lambda t: vars[t[0]])
)
print(interp.transform_string("x $foo x"))
print(interp.transform_string("x \\$foo x"))

Which gives me the correct output:

x bar x
x $foo x

Upvotes: 1
