Reputation: 319
I have this code:
import re

# Test cases
match_dict = ['hello(here)',
              'Hello (Hi)',
              "'dfsfds Hello (Hi) fdfd' Hello (Yes)",
              "Hello ('hi)xx')",
              "Hello ('Hi')"]

for s in match_dict:
    print("INPUT: %s" % s)
    # Drop quoted strings that are not directly preceded by '('
    m = re.sub(r"(?<!\()'[^']+'", '', s, flags=re.M)
    # Collect quoted arguments, e.g. Hello ('Hi')
    paren_quotes = re.findall(r"Hello\s*\('([^']+)'\)", m, flags=re.M)
    output = paren_quotes if paren_quotes else []
    # Remove them, then collect unquoted arguments, e.g. Hello (Hi)
    m = re.sub(r"Hello\s*\('[^']+'\)", '', m, flags=re.M)
    paren_matches = re.findall(r"Hello\s*\(([^)]+)\)", m, flags=re.M)
    if paren_matches:
        output.extend(paren_matches)
    print('OUTPUT: %s\n' % output)
This code is meant to output everything inside the parentheses that follow the word 'Hello'. For example:
Hello (Hi) would give 'Hi'
My problem is that when I put in:
Hello('Hi')
...it still returns 'Hi',
when I want it to return "'Hi'" (with the quotes kept).
Does anyone know how I could fix this code?
Upvotes: 2
Views: 84
Reputation: 298582
Just use non-greedy matching:
matches = re.search(r'^Hello\s*\((.*?)\)', text)
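To see what the non-greedy quantifier changes, here is a minimal sketch. The sample strings and variable names are made up for illustration and are not from the question. Note two limits of the line above: the `^` anchor means the pattern only matches when the string starts with Hello, and `re.search` returns only the first match.

```python
import re

# Hypothetical sample input with two Hello(...) groups
text = "Hello ('Hi') and Hello (Yes)"

# Greedy: .* runs as far as it can, so the group ends at the LAST ')'
greedy = re.search(r'^Hello\s*\((.*)\)', text)
# Non-greedy: .*? stops at the FIRST ')' that lets the pattern succeed
lazy = re.search(r'^Hello\s*\((.*?)\)', text)

greedy_group = greedy.group(1)
lazy_group = lazy.group(1)
print(greedy_group)  # 'Hi') and Hello (Yes
print(lazy_group)    # 'Hi'

# Because of '^', a string that does not start with Hello gives no match
unanchored = re.search(r'^Hello\s*\((.*?)\)', "say Hello (Hi)")
print(unanchored)    # None
```

If you need every occurrence rather than the first one, dropping the `^` and using `re.findall` is probably the simpler route.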
Upvotes: 5
Reputation: 21
>>> import re
>>> p = re.compile(r'Hello\s*\((.*?)\)', re.M)
>>> m = p.findall("Hello ('Hi')")
>>> print(m)
["'Hi'"]
>>> m = p.findall("'dfsfds Hello (Hi) fdfd' Hello (Yes)")
>>> print(m)
['Hi', 'Yes']
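One caveat: `.*?` stops at the first `)`, so the question's test case "Hello ('hi)xx')" would come back as "'hi" only. If you also want to ignore a Hello (...) that sits inside a quoted string, as the question's own code tries to do, a single-pass pattern may work. This is only a sketch under assumptions: the names `PATTERN` and `hello_args` are made up, and I am guessing that quoted text should be returned with its quotes and that quotes never nest.

```python
import re

# First alternative: a quoted string outside parentheses, consumed and ignored.
# Second alternative: Hello(...), where group 1 is either a whole quoted
# string (which may contain ')') or plain text up to the first ')'.
PATTERN = re.compile(r"'[^']*'|Hello\s*\(('[^']*'|[^)]*)\)")

def hello_args(s):
    """Return the text inside each Hello(...), keeping any quotes."""
    return [m.group(1) for m in PATTERN.finditer(s)
            if m.group(1) is not None]

for s in ['hello(here)',
          'Hello (Hi)',
          "'dfsfds Hello (Hi) fdfd' Hello (Yes)",
          "Hello ('hi)xx')",
          "Hello ('Hi')"]:
    print("%s -> %s" % (s, hello_args(s)))
```

With this, "Hello ('Hi')" gives ["'Hi'"] and "Hello ('hi)xx')" gives ["'hi)xx'"]. Unlike the session above, the third test case gives only ['Yes'], because the Hello (Hi) inside the quoted string is skipped.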
Upvotes: 2