jakebird451

Reputation: 2348

Confusion with re.sub

I have the string aa{{{a {{ {aaa{ that I would like to translate to aa { { {a { { {aaa {. Basically, every { must have a space character before it.

The regular expression substitution I am currently using is:

re.sub(r'[^\ ]{', lambda x:x.group(0)[0]+' {', test_case)

The result is: aa {{ {a { { {aaa { (close, but there is still a {{ in the string).

My method works well on sections like a{a{a. However, if two { characters are adjacent, as in a{{a, it only seems to operate on the first { and completely neglects the one that follows.

A clearer example is a long run of braces: {{{{{{{{{{{{. My regex substitution returns: { {{ {{ {{ {{ {{ {. It clearly skips every other character when the { characters are packed together.
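The behaviour described above can be reproduced with a short script. This is only a sketch: the helper name asker_sub is made up for illustration, and the pattern is written as [^ ]\{, which should be equivalent to the [^\ ]{ used in the question.

```python
import re

def asker_sub(text):
    # Match one non-space character followed by '{', then re-emit
    # that character with ' {' after it (same idea as the lambda above).
    return re.sub(r'[^ ]\{', lambda m: m.group(0)[0] + ' {', text)

print(asker_sub('aa{{{a {{ {aaa{'))  # aa {{ {a { { {aaa {
print(asker_sub('{' * 12))           # { {{ {{ {{ {{ {{ {
```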

Why are they skipping? Any help to untangle this confusion would be greatly appreciated!

P.S. I am sorry to everyone out there who has a strong desire to close all the open curly braces.

Upvotes: 1

Views: 192

Answers (3)

mgilson

Reputation: 309841

I'd use a negative lookbehind:

re.sub(r'(?<!\s)(\{)',r' \1','{{{{{{')

Basically, we scan the string until we hit a {. If the character before it isn't whitespace (that's the (?<!\s) bit), the { matches and we replace it with a space followed by the same {.
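Here is a small runnable sketch of the lookbehind substitution applied to the strings from the question. The function name space_braces is hypothetical. Note that a { at the very start of the string also matches, because there is no preceding character to be whitespace, so the result gains a leading space in that case.

```python
import re

def space_braces(text):
    # (?<!\s) is a zero-width negative lookbehind: it checks the
    # character before '{' without making it part of the match.
    return re.sub(r'(?<!\s)(\{)', r' \1', text)

print(space_braces('aa{{{a {{ {aaa{'))  # aa { { {a { { {aaa {
print(repr(space_braces('{{{{{{')))     # ' { { { { { {'
```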

Upvotes: 4

Amadan

Reputation: 198324

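They are skipping because your regular expression consumes two characters: [^\ ] and {. After a match on {{, the second { has already been consumed, so it cannot serve as the preceding character for the next match. You need a zero-width negative lookbehind for the preceding whitespace so that it is not consumed: (?<!\s){. Then you can just replace the match with " {", without the lambda hassle.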
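To see the consumption directly, one can print the match spans for both patterns. This is an illustrative sketch; the variable names are made up for the example.

```python
import re

test_case = '{{{{'

# The consuming pattern takes two characters per match, and matches
# cannot overlap, so only every second '{' is treated as "the brace".
consuming = [m.span() for m in re.finditer(r'[^ ]\{', test_case)]
print(consuming)   # [(0, 2), (2, 4)]

# The lookbehind pattern consumes only the '{' itself, so every
# brace gets its own match.
zero_width = [m.span() for m in re.finditer(r'(?<!\s)\{', test_case)]
print(zero_width)  # [(0, 1), (1, 2), (2, 3), (3, 4)]

fixed = re.sub(r'(?<!\s)\{', ' {', test_case)
print(repr(fixed))  # ' { { { {'
```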

Upvotes: 2

Hyperboreus

Reputation: 32429

I hope this will do the trick:

re.sub(r' *\{', ' {', test_case)
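A quick sketch of how this behaves, with a hypothetical wrapper named normalize_braces. Unlike the lookbehind approach, it also collapses a run of several spaces before a { into a single space, and it only looks at literal spaces rather than all whitespace, which may or may not be what you want.

```python
import re

def normalize_braces(text):
    # ' *' matches zero or more spaces, so each '{' together with any
    # spaces directly before it is replaced by exactly one space + '{'.
    return re.sub(r' *\{', ' {', text)

print(normalize_braces('aa{{{a {{ {aaa{'))  # aa { { {a { { {aaa {
print(normalize_braces('a   {b'))           # a {b
```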

Upvotes: 1
