Reputation: 111109

Python and "re"

A tutorial I have on regex in Python explains how to use the re module. I wanted to grab the URL out of an A tag, so, knowing regex, I wrote the expression, tested it in my regex testing app of choice, and confirmed that it worked. When I put it into Python, it failed:

import re

result = re.match("a_regex_of_pure_awesomeness", "a string containing the awesomeness")
# result is None

After much head scratching I found the issue: re.match automatically expects the pattern to be at the start of the string. I have found a workaround, but I would like to know how to change:

regex = ".*(a_regex_of_pure_awesomeness)"

into

regex = "a_regex_of_pure_awesomeness"

Okay, it's a standard URL regex, but I wanted to avoid any confusion about which part I wanted to get rid of, and possibly pretend to be funny.

Upvotes: 7

Answers (4)

Stuart Fehr

Reputation:

Are you using the re.match() or the re.search() method? My understanding is that re.match() behaves as if there were a "^" at the beginning of your expression and only tries to match at the beginning of the text, while re.search() acts more like Perl regular expressions and is only anchored to the beginning of the text if you include a "^" at the beginning of your expression. Hope that helps.
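A small sketch of that difference. The sample text and the stand-in pattern are made up for illustration and are not the asker's actual regex:

```python
import re

text = "see http://example.com for details"  # hypothetical sample text
pattern = r"http://\S+"                       # hypothetical stand-in pattern

# re.match only tries the pattern at position 0, so it finds nothing here.
at_start = re.match(pattern, text)

# re.search scans forward and returns the first occurrence anywhere.
anywhere = re.search(pattern, text)

# Adding "^" to a search pattern gives the same anchoring as match here.
anchored = re.search("^" + pattern, text)

print(at_start)          # None
print(anywhere.group())  # http://example.com
print(anchored)          # None
```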

Upvotes: 1

jfs

Reputation: 414905

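from bs4 import BeautifulSoup  # BeautifulSoup 4 (the older BeautifulSoup 3 import was `from BeautifulSoup import BeautifulSoup`)

soup = BeautifulSoup(your_html, "html.parser")  # your_html: the HTML text to parse
for a in soup.find_all('a', href=True):
    # each `a` is an <a> tag that has an href attribute
    print(a['href'])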

Upvotes: 4

Aaron Maenpaa

Reputation: 123030

>>> import re
>>> pattern = re.compile("url")
>>> string = "   url"
>>> pattern.match(string)
>>> pattern.search(string)
<_sre.SRE_Match object at 0xb7f7a6e8>

Upvotes: 3

zweiterlinde

Reputation: 14779

In Python, there's a distinction between "match" and "search"; match only looks for the pattern at the start of the string, and search looks for the pattern starting at any location within the string.
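Applied to the question, that means the ".*" prefix can probably be dropped by switching from match to search. A minimal sketch, assuming a made-up HTML snippet and a deliberately simple href pattern (neither comes from the question):

```python
import re

# Hypothetical input and a simplified href pattern, for illustration only.
html = '<p>Read <a href="http://example.com/page">this</a> first.</p>'
href_pattern = re.compile(r'href="([^"]+)"')

# match() needs the ".*" prefix workaround because it starts at position 0.
workaround = re.match(r'.*' + href_pattern.pattern, html)

# search() finds the same URL with the unmodified pattern.
found = href_pattern.search(html)

print(workaround.group(1))  # http://example.com/page
print(found.group(1))       # http://example.com/page
```

Both calls capture the same group here; search just avoids having to wrap the pattern.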

Python regex docs
Matching vs searching

Upvotes: 20
