Reputation: 212
For example:
My string is: 123456789 nn nn oo nn nn mlm nn203
My target is: nn
I then want to search the string from the end toward the beginning and return the first match and its position.
In this example, the result is nn, starting at [-5] and ending at [-3].
I wrote a simple function to do this, but how can I do it with regular expressions?
Upvotes: 18
Views: 37376
Reputation: 1
Please note that every trivial solution that does not reverse the string and pattern will fail in the edge case of overlapping patterns. Let's say you have the string "01-02-03" and you want to find the last match of "\d{2}-\d{2}". Using findall/finditer with [-1] will NOT return "02-03" but "01-02".
(A solution that does not reverse the string works only if the results include all overlapping matches; in that case, taking [-1] does work.)
(As I do not have enough reputation to leave a comment, I have to leave an answer instead.)
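A minimal sketch of the edge case described above, assuming Python's re module. Wrapping the pattern in a lookahead with a capturing group is one common way to collect overlapping matches; the variable names are only illustrative.

```python
import re

s = "01-02-03"
pattern = r"\d{2}-\d{2}"

# Non-overlapping scan: "01-02" consumes the characters that "02-03" needs.
plain = re.findall(pattern, s)

# Zero-width lookahead with a capturing group: the scan advances one
# character at a time, so overlapping candidates are all reported.
overlapping = [m.group(1) for m in re.finditer(rf"(?=({pattern}))", s)]

print(plain[-1])        # 01-02
print(overlapping[-1])  # 02-03
```

With the overlapping list, taking [-1] gives the match closest to the end of the string.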
Upvotes: 0
Reputation: 2889
In Python, the answer is rfind. It works not only on strings, but on bytes too!
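A small illustration of the rfind approach, using the string and target from the question (the variable names are made up here). Both str.rfind and bytes.rfind return the start index of the last occurrence, or -1 if there is none:

```python
s = "123456789 nn nn oo nn nn mlm nn203"
target = "nn"

start = s.rfind(target)              # index of the last occurrence, -1 if absent
end = start + len(target)
print(start, end)                    # 29 31
print(start - len(s), end - len(s))  # -5 -3

# The same method exists on bytes objects.
data = s.encode("ascii")
print(data.rfind(b"nn"))             # 29
```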
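Regular expressions also have two special anchors: ^ and $.
tx = "hello .... ok"
# ^ anchors the match at the beginning of the string
/^.*[o].*/ matches the whole string; the greedy .* puts [o] on the last o (in ok)
# $ anchors the match at the END of the string
/.*[o].*$/ also matches the whole string, again with [o] on the o in ok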
Hope it helps!
Upvotes: 0
Reputation: 103754
For the string itself, just do a findall and use the last one:
import re
st='123456 nn1 nn2 nn3 nn4 mlm nn5 mlm'
print(re.findall(r'(nn\d+)',st)[-1])
Prints nn5
You can also do the same thing using finditer, which makes it easier to find the relevant indexes:
print([(m.group(),m.start(),m.end()) for m in re.finditer(r'(nn\d+)',st)][-1])
Prints ('nn5', 27, 30)
If you have a lot of matches and you only want the last, sometimes it makes sense to simply reverse the string and pattern:
m=re.search(r'(\d+nn)',st[::-1])
# map the reversed positions back onto the original string
start,end=len(st)-m.end(1),len(st)-m.start(1)
print(st[start:end])
Or, modify your pattern into something that only the last match could possibly satisfy:
# since fixed width, you can use a lookbehind:
m=re.search(r'(...(?<=nn\d)(?!.*nn\d))',st)
if m: print(m.group(1))
Or, take advantage of the greediness of .*, which will always return the last of multiple matches:
# .* will skip to the last match of nn\d
m=re.search(r'.*(nn\d)', st)
if m: print(m.group(1))
Any of those prints nn5
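If there are many matches, the finditer variant can also be written without building the whole list. A minimal sketch, assuming a hypothetical helper named last_match (not part of re) that keeps only the most recent match object:

```python
import re

def last_match(pattern, text):
    """Return the last non-overlapping match of pattern in text, or None."""
    last = None
    for last in re.finditer(pattern, text):
        pass  # keep only the most recent match object
    return last

st = '123456 nn1 nn2 nn3 nn4 mlm nn5 mlm'
m = last_match(r'nn\d+', st)
if m:
    print(m.group(), m.start(), m.end())  # nn5 27 30
```

This still scans the string from the beginning, so it saves memory rather than time.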
Upvotes: 24
Reputation: 4392
First, if you're not looking for a regular expression, str.rfind is a lot easier to get right.
You can use a regular expression with a negative lookahead; see the documentation of re:
import re
s = "123456789 nn nn oo nn nn mlm nn203"
match = re.search("(nn)(?!.*nn.*)", s)
# for your negative numbers:
print((match.start()-len(s), match.end()-len(s)))
# (-5, -3)
Upvotes: 7
Reputation: 16361
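Idea: search the reversed string (reversing the pattern too if it is not symmetric), then negate and swap the start and end positions.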
Example:
>>> import re
>>> s = "123456789 nn nn oo nn nn mlm nn203"
>>> m = re.search("(nn)", s[::-1])
>>> -m.end(), -m.start()
(-5, -3)
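The example works unchanged because "nn" reads the same in both directions. For an asymmetric literal target, the target has to be reversed as well. A rough sketch, assuming a hypothetical helper rsearch_literal (not part of re) that handles literal strings only, not general patterns:

```python
import re

def rsearch_literal(target, s):
    """Hypothetical helper: negative (start, end) of the last occurrence
    of the literal string target in s, or None if it does not occur."""
    # Reverse both the text and the literal; re.escape keeps the target literal.
    m = re.search(re.escape(target[::-1]), s[::-1])
    if m is None:
        return None
    return -m.end(), -m.start()

s = "123456789 nn nn oo nn nn mlm nn203"
print(rsearch_literal("nn", s))   # (-5, -3)
print(rsearch_literal("nn2", s))  # (-5, -2)
```

One caveat: when the match sits at the very end of the string, the end value is 0, so s[start:end] would be empty; slice with s[start:] in that case.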
Upvotes: 5