Vlad T.

Reputation: 2608

regex + Python: How to find string with '?' in it?

I have a multi-line string in the content variable, and I need to retrieve all matches for a pattern built from a uri that contains a question mark.

This is what I have so far:

content = """
/blog:text:Lorem ipsum dolor sit amet, consectetur adipisicing elit
<break>
text:Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore
<break>
text:Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia.

/blog?page=1:text:Lorem ipsum dolor sit amet, consectetur adipisicing elit
<break>
text:Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore
<break>
text:Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia.
"""

import re

#uri = '/blog' # Works fine
uri = '/blog?page=1'
re.findall('(?ism)^%s?:(.*?)(\n\n)' % uri, content)

It works fine until uri contains a ? followed by parameters; then I get an empty list.

Any ideas how to fix the regex?

Upvotes: 0

Views: 47

Answers (3)

Jon Clements

Reputation: 142156

I'd keep it simple: find the possible matches first, then keep only those containing a ?, e.g.:

import re

candidates = (m.group(1) for m in re.finditer('^(.*?):', content, flags=re.M))
matches = [m for m in candidates if '?' in m]
# ['/blog?page=1']

Upvotes: 1

Charles Duffy

Reputation: 295443

Python's re.escape() is your friend. If you don't use it, the ? inside the uri is treated with its usual meaning inside of a regular expression (making the prior item a 0-or-1 match).

import re

uri = '/blog?page=1'
re.findall('(?ism)^%s?:(.*?)(\n\n)' % re.escape(uri), content)

I'm not clear on exactly what you want the ?: after the %s to do, so I'm leaving it in on the potentially faulty presumption that it's there for a reason.
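One more thing to watch for: in the sample content, the last block ends with a single newline, so a pattern that insists on \n\n cannot match it even once the uri is escaped. Below is a minimal sketch of one way around that, not a definitive fix. It assumes the trailing ? after %s was unintended and drops it, and it accepts either a blank line or the end of the string as the terminator. The helper name find_block and the shortened sample text are made up for illustration.

```python
import re

# Shortened stand-in for the content in the question (same layout).
content = """
/blog:text:first entry
<break>
text:second line

/blog?page=1:text:paged entry
<break>
text:another line
"""

def find_block(uri, text):
    """Return the body of every block whose first line starts with '<uri>:'."""
    # re.escape() makes '?' (and any other metacharacter) match literally.
    # The block ends at a blank line or at the end of the string.
    pattern = r'^%s:(.*?)(?:\n\n|\n?\Z)' % re.escape(uri)
    return re.findall(pattern, text, flags=re.I | re.S | re.M)

print(find_block('/blog', content))
print(find_block('/blog?page=1', content))
```

Passing the flags through the flags argument instead of an inline (?ism) group also avoids the error that Python 3.11 and later raise when inline global flags are not at the very start of the pattern.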

Upvotes: 1

Sabuj Hassan

Reputation: 39365

I didn't see two consecutive newlines after the last block in your content. Also, I have escaped the ? in uri, since it is a regex metacharacter.

uri = r'/blog\?page=1'
re.findall('(?ism)^%s?:(.*?)[\n\r]' % uri, content)
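To see what the escaping changes, here is a small check, offered as a sketch rather than part of the answer above. It compares the unescaped uri, a hand-escaped pattern, and re.escape() against the literal uri string. The variable names are illustrative only.

```python
import re

uri = '/blog?page=1'

# Unescaped: 'g?' means "an optional g", so the literal '?' is never matched.
unescaped = re.search(uri, uri)

# The unescaped pattern does match a string with no 'g' and no '?'.
surprising = re.search(uri, '/blopage=1')

# Escaped by hand (raw string) and escaped with re.escape().
by_hand = re.search(r'/blog\?page=1', uri)
by_escape = re.search(re.escape(uri), uri)

print(unescaped)            # None
print(surprising.group(0))  # /blopage=1
print(by_hand.group(0))     # /blog?page=1
print(by_escape.group(0))   # /blog?page=1
```

Escaping by hand works for a fixed string; re.escape() is the safer choice when the uri comes from a variable, because it also covers characters such as . and +.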

Upvotes: 0
