Reputation: 3716
I have no experience with regex. I have dabbled in it a few times but never stuck with it.
I'm scraping a site in Python using BeautifulSoup and have come across img tags whose id attribute can be used to pull the data I want. I need a regex that matches every id meeting the following constraint:
img-%d: %d is a whole number ranging from 0 to 255
<img id="img-1" ...>
<img id="img-2" ...>
<img id="img-3" ...>
...
<img id="img-25" ...>
...
<img id="img-255" ...>
In regex, how would I write the expression to match img-%d? I know \d matches a single digit, but there are 256 possible values here (0 to 255), so a single [0-9] is not enough.
The code is really simple; I'm just missing the regex.
# regex_needed = re.compile('expression here')
soup.find_all('img', attrs={'id': regex_needed})
Upvotes: 0
Views: 90
Reputation: 26667
You can use the regex
img-\d{1,3}
which matches img- followed by at least 1 and at most 3 digits. Note that it also accepts numbers above 255, such as img-999.
import re
# Compile once, then pass the pattern as the id filter.
pat = re.compile(r'img-\d{1,3}')
soup.find_all('img', attrs={'id': pat})
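To see how loose this pattern is, here is a small sketch that uses only the re module. BeautifulSoup applies a compiled pattern with its search() method, so an unanchored pattern can match anywhere inside the id. The sample ids and the anchored variant below are my own illustrations, not part of the answer above.

```python
import re

# Sample ids are made up for illustration.
ids = ['img-0', 'img-255', 'img-999', 'img-1234', 'thumb-img-7']

# Pattern from the answer: unanchored, any 1 to 3 digits.
loose = re.compile(r'img-\d{1,3}')
loose_hits = [i for i in ids if loose.search(i)]
print(loose_hits)     # every id matches, including 'img-1234' and 'thumb-img-7'

# Anchoring both ends rejects extra text, but still allows 256 to 999.
anchored = re.compile(r'^img-\d{1,3}$')
anchored_hits = [i for i in ids if anchored.search(i)]
print(anchored_hits)  # ['img-0', 'img-255', 'img-999']
```

If every id on the page is known to stay within 0 to 255, the loose pattern is probably good enough; the anchors only matter when other ids could contain img- followed by digits.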
Upvotes: 3
Reputation: 6958
If you want a more specific regex than nu11p01n73R's, one that only matches numbers from 0 to 255, try this for the numeric part of your pattern (it does not include the img- prefix):
\b([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b
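One way to combine the range check with the img- prefix is sketched below. The names IMG_ID and is_img_id are hypothetical, and the pattern is my own rearrangement of the alternation above, anchored with ^ and $ in place of \b so that nothing else may surround the id.

```python
import re

# Hypothetical combined pattern: 'img-' followed by exactly 0 to 255,
# with no leading zeros.
IMG_ID = re.compile(r'^img-(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])$')

def is_img_id(value):
    """Return True if value is 'img-N' with N in 0..255 (hypothetical helper)."""
    # value may be None when a tag has no id attribute.
    return value is not None and IMG_ID.match(value) is not None

print(is_img_id('img-25'))   # True
print(is_img_id('img-256'))  # False
```

Either the compiled pattern or the function should work as the id filter, for example soup.find_all('img', attrs={'id': IMG_ID}) or soup.find_all('img', id=is_img_id), since BeautifulSoup accepts both regular expressions and functions as attribute filters.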
Upvotes: 1