Reputation: 129
I have this snippet
print(re.sub(r'(-script\.pyw|\.exe)?', '','.exe1.exe.exe'))
The output is 1. If I remove the ? from the above snippet and run it as
print(re.sub(r'(-script\.pyw|\.exe)', '','.exe1.exe.exe'))
The output is the same again. Although I am using ?, the pattern seems to be greedy and replaces every '.exe' with an empty string. Is there any workaround to replace only the first occurrence?
Upvotes: 1
Views: 48
Reputation: 5308
? is greedy, so if it can match, it will. For example, aaab? will match aaab instead of aaa.
In order to make ? non-greedy, you must add an extra ? (this is the same way you make * and + non-greedy, by the way). So aaab?? will just match aaa. Yet, at the same time, aaab??c will match aaabc.
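To check the greedy and non-greedy behaviour described above, here is a minimal sketch. The input string "aaabc" and the variable names are my own choices for illustration; they are not from the question.

```python
import re

# Illustrative input (an assumption): chosen so that every pattern can match at position 0.
text = "aaabc"

# Greedy ?: takes the optional b when it is available.
greedy = re.match(r"aaab?", text).group()

# Non-greedy ??: skips the optional b when the rest of the pattern allows it.
lazy = re.match(r"aaab??", text).group()

# Non-greedy ?? followed by c: the b must be consumed, otherwise c cannot match.
forced = re.match(r"aaab??c", text).group()

print(greedy)  # aaab
print(lazy)    # aaa
print(forced)  # aaabc
```

Note that laziness only changes which match is preferred; it does not limit how many times re.sub replaces.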
Upvotes: 0
Reputation: 2958
The question mark makes the preceding token in the regular expression optional. Use
print(re.sub(r'(-script\.pyw|\.exe)', '','.exe1.exe.exe', 1))
if you want to remove only the first match.
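As a quick comparison, this sketch runs the pattern from the question with and without a replacement limit. The variable names are mine, and passing count as a keyword argument is just a readability choice; it is equivalent to the positional 1 used above.

```python
import re

pattern = r'(-script\.pyw|\.exe)'
name = '.exe1.exe.exe'  # the sample string from the question

# Default count=0 means "replace every match".
all_removed = re.sub(pattern, '', name)

# count=1 stops after the first (leftmost) match.
first_removed = re.sub(pattern, '', name, count=1)

print(all_removed)    # 1
print(first_removed)  # 1.exe.exe
```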
Upvotes: 0
Reputation: 198324
re.sub(pattern, repl, string, count=0, flags=0)
This is the signature of the re.sub function. Notice the count parameter. If you just want the first occurrence to be replaced, use count=1.
? is a non-greedy modifier for repetition operators; when it stands next to anything else, it makes the previous element optional. Thus, your top expression is replacing either -script.pyw or .exe or nothing with nothing. Since replacing nothing with nothing doesn't change the string, the top and the bottom version (where the empty string cannot be matched) give the same result.
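One way to see the "nothing replaced by nothing" matches is to list what each pattern actually matches. This is only a sketch with variable names of my own; the exact number of empty matches can differ between Python versions, so the code checks only that they exist.

```python
import re

name = '.exe1.exe.exe'  # the sample string from the question
optional = r'(-script\.pyw|\.exe)?'
required = r'(-script\.pyw|\.exe)'

# group(0) is the full text of each match; '' marks an empty match.
optional_matches = [m.group(0) for m in re.finditer(optional, name)]
required_matches = [m.group(0) for m in re.finditer(required, name)]

print(optional_matches)  # includes '' entries alongside the '.exe' matches
print(required_matches)  # only non-empty '.exe' matches

# Replacing an empty match with '' is a no-op, so both calls give the same string.
same_result = re.sub(optional, '', name) == re.sub(required, '', name)
print(same_result)
```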
Upvotes: 1