Reputation: 1147
My regex syntax is not returning the correct results. I have data returned from GitHub using the github3.py library that returns three possible strings when parsing through the patch key of md files (https://developer.github.com/v3/pulls/#list-pull-requests-files). I've read the regex documentation and several threads, but I'm missing something in my syntax.
string1 = '> [HELP.SELECTOR]'
string2 = '-> [HELP.SELECTOR]'
string3 = '+> [HELP.SELECTOR]'
I want to print True for an exact match to string2 or string3, and False if string1 is found. My results are returning False if string2 or string3 is found.
for prs in repo.pull_requests():
    search_string_found = 'False'
    regex_search_string1 = re.compile(r"^\+>\s\[HELP.SELECTOR\]")
    regex_search_string2 = re.compile(r"^->\s\[HELP.SELECTOR\]")
    for data in repo.pull_request(prs.number).files():
        match_text1 = regex_search_string1.search(data.patch)
        match_text2 = regex_search_string2.search(data.patch)
        if match_text1 is not None and match_text2 is not None:
            search_string_found = 'True'
            break
    print('HELP.SELECTOR present in file: ', search_string_found)
Upvotes: 1
Views: 69
Reputation: 626853
Since you confirm that your strings may not be located at the start of the string, you need a pattern without the ^ anchor:
regex_search_string = re.compile(r"[+-]>\s\[HELP\.SELECTOR\]")
for data in repo.pull_request(prs.number).files():
    match_text = regex_search_string.search(data.patch)
    if match_text:
        search_string_found = 'True'
        break
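To see how anchoring and the and/or condition affect the result, here is a minimal, self-contained sketch. The sample_patch text is made up for illustration (a real data.patch would hold the diff of one file), and the names anchored, anchored_multiline, plus, and minus are hypothetical.

```python
import re

# Hypothetical patch text; a real data.patch holds a multi-line diff.
sample_patch = (
    "@@ -1,3 +1,3 @@\n"
    " # Title\n"
    "-> [HELP.SELECTOR]\n"
    "+Some new text\n"
)

# No anchor: the pattern can match anywhere in the patch.
unanchored = re.compile(r"[+-]>\s\[HELP\.SELECTOR\]")

# "^" alone only matches at the very start of the whole string...
anchored = re.compile(r"^[+-]>\s\[HELP\.SELECTOR\]")

# ...unless re.MULTILINE is set, which lets "^" match at each line start.
anchored_multiline = re.compile(r"^[+-]>\s\[HELP\.SELECTOR\]", re.MULTILINE)

found_unanchored = bool(unanchored.search(sample_patch))
found_anchored = bool(anchored.search(sample_patch))
found_multiline = bool(anchored_multiline.search(sample_patch))

# Two separate patterns combined with "and" need BOTH an added and a
# removed line in the same patch; "or" needs only one of them.
plus = re.compile(r"^\+>\s\[HELP\.SELECTOR\]", re.MULTILINE)
minus = re.compile(r"^->\s\[HELP\.SELECTOR\]", re.MULTILINE)
both = plus.search(sample_patch) is not None and minus.search(sample_patch) is not None
either = plus.search(sample_patch) is not None or minus.search(sample_patch) is not None

print(found_unanchored, found_anchored, found_multiline)  # True False True
print(both, either)                                       # False True
```

The unanchored pattern also matches when the text appears in the middle of a line, while the re.MULTILINE variant only matches diff lines that begin with + or -. Which one fits depends on what the patches actually look like.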
Note:
[+-] matches either a + or a -, since it is a character class that matches a single character from the range/set specified inside it.
A + inside [...] never has to be escaped.
A - at the start or end of [...] does not have to be escaped.
re.search returns a match data object or None, so you need to check the result before accessing the text matched/captured.
Upvotes: 1
Reputation: 168626
It is easier to maintain one regex string than several. Try this:
import re

strings = [
    '> [HELP.SELECTOR]',
    '-> [HELP.SELECTOR]',
    '+> [HELP.SELECTOR]',
]
for string in strings:
    # \. matches a literal dot; $ requires the string to end after the ]
    print(bool(re.match(r'[-+]> \[HELP\.SELECTOR\]$', string)), string)
Result:
False > [HELP.SELECTOR]
True -> [HELP.SELECTOR]
True +> [HELP.SELECTOR]
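One detail worth checking in a pattern like this is the dot in HELP.SELECTOR: unescaped, it matches any character, not just a literal dot. A small sketch of the difference, where lookalike is a made-up near-miss string and loose and strict are hypothetical names:

```python
import re

loose = re.compile(r"[-+]> \[HELP.SELECTOR\]$")    # "." matches any character
strict = re.compile(r"[-+]> \[HELP\.SELECTOR\]$")  # "\." matches a literal dot

lookalike = "-> [HELPXSELECTOR]"  # hypothetical near-miss string

loose_hit = bool(loose.match(lookalike))
strict_hit = bool(strict.match(lookalike))
exact_hit = bool(strict.match("-> [HELP.SELECTOR]"))

print(loose_hit, strict_hit, exact_hit)  # True False True
```

For the three strings in the question both forms give the same answers; the escaped dot only matters if near-miss text can show up in the data.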
Applying that to your problem,
#UNTESTED
for prs in repo.pull_requests():
    # re.search rather than re.match, since the text may appear
    # anywhere in the patch, not only at its very start
    search_string_found = any(
        re.search(r'[-+]> \[HELP\.SELECTOR\]', data.patch)
        for data in repo.pull_request(prs.number).files())
    print('HELP.SELECTOR present in file: ', search_string_found)
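If some files come back without a text diff, data.patch might be None, and passing None to the re functions raises a TypeError. I have not verified this against github3.py, so treat it as an assumption. A minimal sketch of a guard, using a hypothetical helper any_patch_has_selector and made-up patch strings in place of real API data:

```python
import re
from typing import Iterable, Optional

# "^" with re.MULTILINE matches at the start of every line in the patch.
SELECTOR_RE = re.compile(r"^[-+]> \[HELP\.SELECTOR\]", re.MULTILINE)


def any_patch_has_selector(patches: Iterable[Optional[str]]) -> bool:
    """Return True if any patch has an added or removed HELP.SELECTOR line."""
    # Skip None patches (assumption: files without a text diff have no patch).
    return any(SELECTOR_RE.search(p) for p in patches if p is not None)


# Hypothetical stand-ins for the data.patch values of one pull request.
changed = any_patch_has_selector([None, "@@ -1 +1 @@\n+> [HELP.SELECTOR]\n"])
unchanged = any_patch_has_selector([None, " > [HELP.SELECTOR]\n"])

print(changed, unchanged)  # True False
```

Inside the loop this could be called as any_patch_has_selector(data.patch for data in repo.pull_request(prs.number).files()).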
Upvotes: 0