Reputation: 3635
I'm using Python 3 and the re module.
Here are the strings:
s1 = "http://52portal/flood-2011-year-39090/gallery?p=3"
s2 = "http://52portal/flood-2011-year-39090"
I need to extract the number 39090. The ID is always present, and it always has a - prefix but no particular suffix.
I have an implementation that works when there are no other numbers in the URL:
pattern = r'-([0-9]+)'
re.findall(pattern, s)[0]
How can I tell the program to ignore a number that has - as both its prefix and its suffix (such as 2011 above)?
Upvotes: 0
Views: 3481
Reputation: 626689
You need to find the right-side boundary. It can be / or the end of the string.
(?<=-)\d+(?=/|$)
Here, (?<=-) is a positive lookbehind that checks that there is a hyphen before the one or more digits (\d+), and (?=/|$) is a positive lookahead that makes sure a / or the end of the string comes right after that sequence.
See demo
Here is sample code:
import re
p = re.compile(r'(?<=-)\d+(?=/|$)')
test_str = "http://52portal/flood-2011-year-39090/gallery?p=3\nhttp://52portal/flood-2011-year-39090"
print(re.findall(p, test_str))
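If you only need the ID from one URL at a time, the same pattern can be wrapped in a small helper. This is just a sketch: the name extract_id and the choice to return None when nothing matches are my own assumptions, not something from the question.

```python
import re

# Digits preceded by a hyphen and followed by "/" or the end of the string.
ID_PATTERN = re.compile(r'(?<=-)\d+(?=/|$)')

def extract_id(url):
    """Hypothetical helper: return the ID as a string, or None if no ID matches."""
    match = ID_PATTERN.search(url)
    return match.group(0) if match else None

s1 = "http://52portal/flood-2011-year-39090/gallery?p=3"
s2 = "http://52portal/flood-2011-year-39090"
print(extract_id(s1))
print(extract_id(s2))
```

Using search instead of findall(...)[0] avoids an IndexError when a URL has no ID at all.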
Upvotes: 3
Reputation: 21437
Try this:
(?<=-)[0-9]+?(?=/|$)
https://regex101.com/r/sY3qI2/2
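A quick way to check this against the two URLs from the question is sketched below. It also runs a greedy variant ([0-9]+ without the ?) for comparison; that variant is my own addition, and on these inputs the look-ahead forces both to stop at the same place.

```python
import re

urls = [
    "http://52portal/flood-2011-year-39090/gallery?p=3",
    "http://52portal/flood-2011-year-39090",
]

lazy = re.compile(r'(?<=-)[0-9]+?(?=/|$)')   # pattern from this answer
greedy = re.compile(r'(?<=-)[0-9]+(?=/|$)')  # same pattern, greedy quantifier (for comparison)

for url in urls:
    # Both lists should contain only the trailing ID.
    print(lazy.findall(url), greedy.findall(url))
```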
Upvotes: 1