Reputation: 1452
I have a little problem with the following code.
import re
pattern = re.compile(r"((?:^|[^\\@]|\\.)+)@")
for text in [
    r"ok@\@.py",
    r"ok@\\@.py",
    r"ok@\\\@.py",
    r"ok@\\\\@.py",
    r"ok@\\\\\@.py",
]:
    search = re.search(pattern, text)
    print('---', text, sep="\n")
    if search:
        print(pattern.sub(r"\1<star>", text))
    else:
        print('<< NOTHING FOUND ! >>')
This prints:
---
ok@\@.py
ok<star>\@.py
---
ok@\\@.py
ok<star>\\<star>.py
---
ok@\\\@.py
ok<star>\\\<star>.py
---
ok@\\\\@.py
ok<star>\\\\<star>.py
---
ok@\\\\\@.py
ok<star>\\\\\<star>.py
The problem starts with the 3rd output, which is wrong: the second @ is preceded by an escaped backslash and then by a backslash that escapes the @ itself, so it should not be replaced. The problem continues with more backslashes: see the last output, where two escaped backslashes are followed by the escaped character @.
Here is the expected output, where an @ counts as escaped only when an odd number of \ precedes it.
---
ok@\@.py
ok<star>\@.py
---
ok@\\@.py
ok<star>\\<star>.py
---
ok@\\\@.py
ok<star>\\\@.py
---
ok@\\\\@.py
ok<star>\\\\<star>.py
---
ok@\\\\\@.py
ok<star>\\\\\@.py
What is wrong with my regex, and how can I fix it?
Upvotes: 3
Views: 448
Reputation: 13640
Use the following regex:
pattern = re.compile(r"(?<!\\)((?:\\\\)*)@")
And replace with \1<star>, so that the backslash pairs captured by the group are put back and only the @ becomes <star>.
Output:
ok<star>\@.py
ok<star>\\<star>.py
ok<star>\\\@.py
ok<star>\\\\<star>.py
ok<star>\\\\\@.py
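For reference, here is a minimal sketch of the whole thing as a runnable script. The helper name replace_unescaped is only made up for this example, and the replacement string is assumed to be \1<star> so that the captured backslash pairs are kept in the result.

```python
import re

# Zero or more *pairs* of backslashes, not preceded by another backslash,
# then "@": such an "@" has an even number of backslashes in front of it,
# so it is not escaped.
pattern = re.compile(r"(?<!\\)((?:\\\\)*)@")


def replace_unescaped(text):
    # Hypothetical helper for this example.
    # \1 puts the captured backslash pairs back; only the "@" is replaced.
    return pattern.sub(r"\1<star>", text)


for text in [
    r"ok@\@.py",
    r"ok@\\@.py",
    r"ok@\\\@.py",
    r"ok@\\\\@.py",
    r"ok@\\\\\@.py",
]:
    print("---", text, replace_unescaped(text), sep="\n")
```

The lookbehind (?<!\\) is what forces a match to begin at the start of a run of backslashes, so the pairs are always counted from the first backslash.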
See DEMO
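As for what goes wrong with the regex in the question: nothing anchors it to the start of a backslash run. When the attempt starting at the first backslash fails, the engine simply retries one character later, in the middle of the run, where \\. happily pairs up the remaining backslashes. A small sketch that shows where the original pattern's matches begin on the 3rd input (the variable names original, text and spans are just for illustration):

```python
import re

# The pattern from the question, unchanged.
original = re.compile(r"((?:^|[^\\@]|\\.)+)@")

# Third test input: "ok@" followed by three backslashes, then "@.py".
text = r"ok@\\\@.py"

# Collect (start, end) for every match to see where each one begins.
spans = [m.span() for m in original.finditer(text)]
for start, end in spans:
    print(start, end, text[start:end])
```

The second match starts at index 4, which is the second of the three backslashes, so the first backslash of the run is skipped and the remaining two are read as one escaped backslash followed by a free @.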
Upvotes: 2