Brandon Nadeau

Reputation: 3716

Regex: Simple Match

I have no experience with regex. I've dabbled in it a few times, but never stuck with it.

I'm scraping a site in Python using BeautifulSoup and have come across img tags whose id attribute can be used to pull the data I want. I need a regex that matches every id meeting these constraints. The constraints are as follows:

img-%d, where %d is a whole number ranging from 0 to 255

<img id="img-1" ...> <img id="img-2" ...> <img id="img-3" ...> ... <img id="img-25" ...> ... <img id="img-255" ...>

In regex, how would I write the expression to look for img-%d? I know \d matches a single digit, but I have 256 possible values (0 to 255), so a single [0-9] doesn't work here.

The code is really simple; I'm just missing the regex.

# regex_needed = re.compile('expression here')
soup.find_all('img', attrs={'id': regex_needed})

Upvotes: 0

Views: 90

Answers (2)

nu11p01n73R

Reputation: 26667

You can use the regex

img-\d{1,3}

which matches img- followed by at least 1 and at most 3 digits (so it does not restrict the number to 0-255).

import re

pat = re.compile(r'img-\d{1,3}')

soup.find_all('img', attrs={'id': pat})
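
To see what this pattern accepts, here is a small sketch using only the re module. The sample ids are made up for illustration. It uses search() because, per the Beautiful Soup documentation, a compiled regex passed as a filter is applied with its search() method, so an unanchored pattern can match anywhere inside the id. The anchored variant (^ and $) is my own suggestion, not part of the answer above.

```python
import re

# Pattern from the answer: "img-" followed by 1 to 3 digits, unanchored.
pat = re.compile(r'img-\d{1,3}')

# Hypothetical sample ids, chosen to show edge cases.
ids = ['img-0', 'img-25', 'img-255', 'img-999', 'img-1234', 'thumb-img-7', 'img-']

# search() finds the pattern anywhere in the string.
matched = [i for i in ids if pat.search(i)]

# Anchored variant (an assumption, not from the answer): the whole id must be
# "img-" plus 1 to 3 digits, so longer numbers and extra prefixes are rejected.
strict = re.compile(r'^img-\d{1,3}$')
strict_matched = [i for i in ids if strict.search(i)]

print(matched)
print(strict_matched)
```

Even with the anchors, values such as img-999 still pass, so this only checks the shape of the id, not the 0-255 range.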

Upvotes: 3

kayleeFrye_onDeck

Reputation: 6958

If you want a more specific regex than nu11p01n73R's, one that only accepts 0-255, try this for the numeric part of your pattern:

\b([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b

Source
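
On its own, the pattern above only checks the number and says nothing about the img- prefix. One way to combine the two is sketched below, assuming the id should be exactly img- followed by the number. The names octet, pat, and is_valid are just for illustration, and I have swapped the \b boundaries for ^ and $ anchors so the whole id has to match.

```python
import re

# The 0-255 alternation from the pattern above, as a non-capturing group.
octet = r'(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])'

# Assumption: the full id is "img-" plus the number and nothing else.
pat = re.compile(r'^img-' + octet + r'$')


def is_valid(img_id):
    """Return True if img_id looks like img-0 through img-255."""
    return pat.search(img_id) is not None


# Check every candidate from img-0 to img-999 and keep the accepted numbers.
valid = [n for n in range(1000) if is_valid(f'img-{n}')]

print(valid[0], valid[-1], len(valid))
print(is_valid('img-256'), is_valid('img-007'), is_valid('thumb-img-7'))
```

The compiled pat should be usable in the question's call as soup.find_all('img', attrs={'id': pat}). Note that ids with leading zeros, such as img-007, are rejected by this alternation.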

Upvotes: 1
