Peter Graham

Reputation: 11701

How to substitute into a regular expression group in Python

>>> s = 'foo: "apples", bar: "oranges"'
>>> pattern = 'foo: "(.*)"'

I want to be able to substitute into the group like this:

>>> re.sub(pattern, 'pears', s, group=1)
'foo: "pears", bar: "oranges"'

Is there a nice way to do this?

Upvotes: 7

Views: 7975

Answers (2)

Michał Niklas

Reputation: 54302

Something like this works for me:

import re
s = 'foo: "apples", bar: "oranges"'
rx = re.compile(r'(foo: ")(.*?)(".*)')
s_new = rx.sub(r'\g<1>pears\g<3>', s)
print(s_new)  # foo: "pears", bar: "oranges"

Notice the ? in the regex (.*?): it makes the match non-greedy, so it ends at the first ". Also notice that the " characters are inside groups 1 and 3, because they must appear in the output.
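To see why the ? matters, here is a minimal sketch comparing the greedy pattern from the question with a non-greedy version on the same string. The variable names greedy and lazy are just for illustration.

```python
import re

s = 'foo: "apples", bar: "oranges"'

# Greedy: .* runs as far as it can, so the group ends at the LAST double quote.
greedy = re.search(r'foo: "(.*)"', s).group(1)

# Non-greedy: .*? stops as early as it can, so the group ends at the FIRST double quote.
lazy = re.search(r'foo: "(.*?)"', s).group(1)

print(greedy)  # apples", bar: "oranges
print(lazy)    # apples
```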

Instead of \g<1> (or \g<number>) you can use just \1, but remember to use raw strings. The \g<1> form is preferred because \1 can be ambiguous when a digit follows it (see the examples in the Python documentation).
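A small sketch of that ambiguity, assuming the replacement text happens to start with a digit (the "0 pears" value is made up for the example). With \g<1>0 the group number is unambiguous, while \10 is read as a reference to group 10, which this pattern does not have.

```python
import re

s = 'foo: "apples", bar: "oranges"'
rx = re.compile(r'(foo: ")(.*?)(".*)')

# \g<1>0 means "group 1, followed by a literal 0".
ok = rx.sub(r'\g<1>0 pears\g<3>', s)
print(ok)  # foo: "0 pears", bar: "oranges"

# \10 is parsed as group 10; the pattern only has 3 groups, so re raises an error.
error_message = None
try:
    rx.sub(r'\10 pears\g<3>', s)
except re.error as exc:
    error_message = str(exc)
print('re.error:', error_message)
```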

Upvotes: 10

Amarghosh

Reputation: 59451

re.sub(r'(?<=foo: ")[^"]+(?=")', 'pears', s)

The regex matches a sequence of characters that

  • follows the string foo: ",
  • doesn't contain double quotation marks, and
  • is followed by "

(?<=...) and (?=...) are lookbehind and lookahead assertions: they check the surrounding text without making it part of the match.
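As a quick check of what the lookarounds do, this sketch inspects the match itself. Only the value between the quotes is matched, which is why re.sub leaves foo: " and the closing " untouched.

```python
import re

s = 'foo: "apples", bar: "oranges"'
pattern = r'(?<=foo: ")[^"]+(?=")'

m = re.search(pattern, s)
print(m.group(0))  # apples
print(m.span())    # (6, 12)

# Only the matched span is replaced; the text checked by the lookarounds stays.
result = re.sub(pattern, 'pears', s)
print(result)  # foo: "pears", bar: "oranges"
```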

This regex will fail if the value of foo contains escaped quotes. Use the following one to handle them too:

re.sub(r'(?<=foo: ")(\\"|[^"])+(?=")', 'pears', s)

Sample code

>>> import re
>>> s = 'foo: "apples \\\"and\\\" more apples", bar: "oranges"'
>>> print(s)
foo: "apples \"and\" more apples", bar: "oranges"
>>> print(re.sub(r'(?<=foo: ")(\\"|[^"])+(?=")', 'pears', s))
foo: "pears", bar: "oranges"
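For comparison, here is a sketch of what the simpler pattern does on the same input. It stops at the first escaped quote and replaces only part of the value. The names simple and escaped_aware are just labels for the two patterns above.

```python
import re

s = 'foo: "apples \\"and\\" more apples", bar: "oranges"'

simple = r'(?<=foo: ")[^"]+(?=")'
escaped_aware = r'(?<=foo: ")(\\"|[^"])+(?=")'

# The simple pattern ends its match at the quote inside \" and leaves the rest behind.
broken = re.sub(simple, 'pears', s)
print(broken)  # foo: "pears"and\" more apples", bar: "oranges"

# The second pattern consumes \" as a unit, so the whole value is replaced.
fixed = re.sub(escaped_aware, 'pears', s)
print(fixed)  # foo: "pears", bar: "oranges"
```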

Upvotes: 0
