Reputation: 55
I know there are already a lot of questions on Stack Overflow about using a variable in a regular expression, and I managed to make it work when the variable is a single word or only needs to match once. However, once the variable contains a special character or whitespace and I also add a quantifier, I can't get it to match. For example, I want the contents of some_var to match any string that contains 3 consecutive copies of it:
import re
some_var = "what what"
should_match = "what what what what what what hey"
not_a_match = "what what what what hey what what"
match = re.search(re.escape(some_var){3}, should_match)
no_match = re.search(re.escape(some_var){3}, not_a_match)
However, the last two lines give me a syntax error. I've also tried:
'(.*)'+re.escape(some_var){3}+'(.*)'
('(.*)'+re.escape(some_var)+'(.*)'){3}
'(.*)'+'re.escape(some_var){3}'+'(.*)'
're.escape(some_var){3}'
... I just can't seem to get the syntax for it to match correctly (I keep getting the false conditional). I've tried searching for the answer, but I'm not sure how to get it to recognize the quantifier properly.
Upvotes: 2
Views: 502
Reputation: 104072
Regex patterns are just strings (with any non-alphanumerics backslash-escaped to match a literal string), so you can use either format, the % operator, or concatenation to create the pattern string you need.
Given some value of n as a quantifier, in this case 3, you need to construct the regex string appropriately. The {3} part needs to come immediately after a group that wraps re.escape(some_var); without the group, the quantifier would apply only to the last character of the escaped text.
You can use the % operator:
>>> n=3
>>> r'(?:\s*%s){%i}' % (re.escape(some_var), n)
'(?:\\s*what\\ what){3}'
Or use format:
>>> r'(?:\s*{0}){{{1}}}'.format(re.escape(some_var), n)
'(?:\\s*what\\ what){3}'
Or use concatenation:
>>> r'(?:\s*'+re.escape(some_var)+'){'+str(n)+'}'
'(?:\\s*what\\ what){3}'
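Any of these strings will now work as you expect (note that re.match only matches at the start of the string; use re.search to find the pattern anywhere):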
>>> re.match(r'(?:\s*%s){%i}' % (re.escape(some_var), n), should_match)
<_sre.SRE_Match object at 0x104244b28>
>>> re.match(r'(?:\s*%s){%i}' % (re.escape(some_var), n), not_a_match)
>>>
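Since the question used re.search, the same check can be wrapped in a small helper. This is only a sketch: has_n_copies is a hypothetical name, not part of the re module, and it assumes the copies may be separated by whitespace only.

```python
import re

def has_n_copies(needle, haystack, n):
    """Return True if haystack contains n consecutive copies of needle,
    optionally separated by whitespace (hypothetical helper)."""
    pattern = r'(?:\s*%s){%d}' % (re.escape(needle), n)
    # re.search scans the whole string; re.match would only try position 0
    return re.search(pattern, haystack) is not None

some_var = "what what"
print(has_n_copies(some_var, "what what what what what what hey", 3))  # True
print(has_n_copies(some_var, "what what what what hey what what", 3))  # False
print(has_n_copies(some_var, "hey what what what what what what", 3))  # True
```

The third call is where re.search and re.match differ: the copies do not start at the beginning of the string, so re.match with the same pattern would return None.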
Upvotes: 1
Reputation: 627292
You need to group the words and allow optional whitespace between the repetitions:
match = re.search(r"(?:\s*{0}){{3}}".format(re.escape(some_var)), should_match)
See IDEONE demo
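To see why both the group and the \s* are needed, the sketch below compares the pattern above with two deliberately broken variants. The variable names are illustrative only and do not come from the answer.

```python
import re

some_var = "what what"
should_match = "what what what what what what hey"
escaped = re.escape(some_var)

# No group: {3} applies only to the final "t", so this wants "what whattt"
ungrouped = escaped + "{3}"
# Group but no \s*: the copies would have to touch, with no space between them
no_space = "(?:" + escaped + "){3}"
# Group plus optional whitespace before each copy
working = r"(?:\s*{0}){{3}}".format(escaped)

for pattern in (ungrouped, no_space, working):
    # Print each pattern and whether it is found in should_match
    print(pattern, bool(re.search(pattern, should_match)))
```

Only the last variant finds the three space-separated copies in should_match.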
The regex will look like (?:\s*what\ what){3}, and this is how it works: it matches 3 sequences of
\s* - 0 or more whitespace characters, followed by
what\ what - the literal substring what what.
Upvotes: 2