rachelvsamuel

Reputation: 1711

Python: re.findall won't find a string in html

re.findall doesn't find a string that is present in my HTML. Here is my code:

import re

def get_id(html_source):
    the_button = re.findall("preview.aspx?id=1692003076", html_source)
    print(the_button)

When I print(html_source) I get the HTML, which visibly contains "preview.aspx?id=1692003076". re.search also fails to find the string.

I have another re.findall in my code, and it works fine:

id_matches = re.findall('<input type="checkbox" id="\d+"', html_source)

Any idea why it doesn't work?

Upvotes: 0

Views: 262

Answers (2)

ZeroQ

Reputation: 109

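Note that "?" is a special character in regular expressions: it makes the preceding character optional, so "aspx?id" matches "aspid" or "aspxid" but never a literal "?". You need to escape it as "\?".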
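As a minimal sketch of the difference, the snippet below runs the original pattern and two escaped variants against a made-up one-line HTML sample (the html_source value here is an assumption for illustration, not the asker's real page). re.escape is the standard-library helper that escapes a literal string for you.

```python
import re

# Hypothetical sample; the asker's real HTML was not posted.
html_source = '<a href="preview.aspx?id=1692003076">Preview</a>'

# Unescaped: "x?" means "an optional x" and "." means "any character",
# so the pattern never matches the literal "?" in the page.
unescaped = re.findall("preview.aspx?id=1692003076", html_source)

# Escaped by hand: "\." and "\?" match the literal characters.
escaped = re.findall(r"preview\.aspx\?id=1692003076", html_source)

# Escaped automatically: re.escape() escapes every special character.
auto_escaped = re.findall(re.escape("preview.aspx?id=1692003076"), html_source)

print(unescaped)
print(escaped)
print(auto_escaped)
```

With this sample, the unescaped pattern returns an empty list while both escaped patterns return the full string.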

Upvotes: 1

Ben

Reputation: 6348

Try escaping the special characters in your regex: "." and "?". Or, since you are searching for a fixed string, use html_source.find("preview.aspx?id=1692003076"), which returns the index of the first occurrence of that string, or -1 if it is not found.
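A rough sketch of the plain-string approach, again using a made-up html_source (an assumption, since no sample HTML was posted). No regex is involved, so nothing needs escaping:

```python
# Hypothetical sample; replace with the real page source.
html_source = '<a href="preview.aspx?id=1692003076">Preview</a>'
target = "preview.aspx?id=1692003076"

# str.find returns the index of the first occurrence, or -1 if absent.
index = html_source.find(target)

# The "in" operator is enough when only a yes/no answer is needed.
found = target in html_source

print(index, found)
```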

If that doesn't work, post a sample of the HTML in your question so we can reproduce this problem.

Upvotes: 0
