Reputation: 7000
I have the following test program:
import re

class Test:
    def __init__ (self):
        self.idFiltering = True
        self.aliases = [
            ('rose', 'jasmin')
        ]
        for s in (
            '__rose__',
            'rose',
            'moon__rose',
            'rose__fish',
            'moon__rose__jelly__fish',
            'moon__rose__rose__rose__fish',
            'sun.moon.rose',
            'rose.fish',
            'rosexfish',
            'moon.rose.jelly__fish',
            'moon/rose',
            'rose/fish',
            'moon/rose/jelly__fish',
        ):
            print (s, self.filterId (s))
        print ('done')

    def filterId (self, qualifiedId):
        if not self.idFiltering or (qualifiedId.startswith ('__') and qualifiedId.endswith ('__')):
            return qualifiedId
        else:
            for alias in self.aliases:
                pattern = re.compile (rf'((__)|(?=[^./])){alias [0]}((__)|(?=[./$]))')
                # Replace twice to deal with overlap
                qualifiedId = pattern.sub (alias [1], qualifiedId)
                qualifiedId = pattern.sub (alias [1], qualifiedId)
            return qualifiedId

test = Test ()
I expect it to produce:
__rose__ __rose__
rose jasmin
moon__rose moon__jasmin
rose__fish jasminfish
moon__rose__jelly__fish moonjasminjelly__fish
moon__rose__rose__rose__fish moonjasminjasminjasminfish
sun.moon.rose sun.moon.jasmin
rose.fish jasmin.fish
rosexfish rosexfish
moon.rose.jelly__fish moon.jasmin.jelly__fish
moon/rose moon/jasmin
rose/fish jasmin/fish
moon/rose/jelly__fish moon/jasmin/jelly__fish
done
But it produces:
__rose__ __rose__
rose rose
moon__rose moon__rose
rose__fish jasminfish
moon__rose__jelly__fish moonjasminjelly__fish
moon__rose__rose__rose__fish moonjasminjasminjasminfish
sun.moon.rose sun.moon.rose
rose.fish jasmin.fish
rosexfish rosexfish
moon.rose.jelly__fish moon.jasmin.jelly__fish
moon/rose moon/rose
rose/fish jasmin/fish
moon/rose/jelly__fish moon/jasmin/jelly__fish
done
In other words, it doesn't replace 'rose' at the end of a word. It seems to ignore the $ in my pattern. What am I doing wrong?
[EDIT after comments of Aran-Fey and Pushpesh Kumar Rajwanshi]
I've changed the regex to:
rf'((__)|(?=[^./])){alias [0]}((__)|(?=[./])|$)'
and it works fine now, so my problem is solved.
I've also tried:
rf'(^|(__)|(?=[./])){alias [0]}((__)|(?=[./])|$)'
but that does not work. Just curious: Why not?
[EDIT2]
As Rarblack pointed out, my solution just worked by sheer luck. With his/her suggestion I think I found the right regex:
rf'(^|(__)|(?<=[./])){alias [0]}((__)|(?=[./])|$)'
It produces the expected output, and this time not by coincidence.
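For anyone wondering about the difference between the variants above, here is a minimal sketch of my understanding. The input 'primrose' and the variable names are made up for illustration; they are not part of my test program. A lookahead placed in front of the alias inspects the same character the alias itself has to match, whereas a lookbehind inspects the character before it.

```python
import re

# (?=[./]) in front of 'rose' demands that the next character is '.' or '/',
# yet 'rose' then needs that same character to be 'r', so this can never match.
lookahead = re.compile(r'(?=[./])rose')
# (?<=[./]) checks the character *before* 'rose' instead.
lookbehind = re.compile(r'(?<=[./])rose')

print(lookahead.search('moon.rose'))          # None
print(lookbehind.search('moon.rose').span())  # (5, 9)

# First edit: (?=[^./]) is always true when 'r' follows, so it restricts nothing.
lucky = re.compile(r'((__)|(?=[^./]))rose((__)|(?=[./])|$)')
# Second edit: start of string, '__', or a preceding '.' or '/'.
fixed = re.compile(r'(^|(__)|(?<=[./]))rose((__)|(?=[./])|$)')

# 'primrose' is a hypothetical input, not one of my test cases.
print(lucky.sub('jasmin', 'primrose'))       # primjasmin
print(fixed.sub('jasmin', 'primrose'))       # primrose
print(fixed.sub('jasmin', 'sun.moon.rose'))  # sun.moon.jasmin
```

So the first edited pattern only appeared to work because none of my test strings had 'rose' glued to the end of another word.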
Upvotes: 1
Views: 61
Reputation: 4664
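When you put regex metacharacters inside [], most of them lose their special meaning and are treated as ordinary characters. That is why [./$] is not working: inside a character class, $ matches a literal dollar sign, not the end of the string. Also, a ^ as the first character inside square brackets negates the class, so [^./] matches any character except . and /.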
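A small sketch of both points, using simplified patterns rather than the exact one from the question (the variable names are only for illustration):

```python
import re

# Inside a character class, $ is a literal dollar sign.
in_class = re.compile(r'rose(?=[./$])')
# Outside the class, $ is the end-of-string anchor.
anchored = re.compile(r'rose(?=[./]|$)')

print(in_class.search('moon.rose'))         # None: nothing follows 'rose'
print(in_class.search('rose$').span())      # (0, 4): followed by a literal '$'
print(anchored.search('moon.rose').span())  # (5, 9): 'rose' at end of string

# A leading ^ negates the class: everything except '.' and '/'.
print(re.findall(r'[^./]', 'a.b/c'))        # ['a', 'b', 'c']
```

Moving the $ out of the brackets, as in (?=[./]|$) or ((?=[./])|$), restores its meaning as an anchor.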
Upvotes: 2