Reputation: 33
I am trying to find all matches for a capital letter followed by a closing parenthesis, i.e. A), B), C), etc. I tested the pattern with re-try (http://re-try.appspot.com/) and it works perfectly, but when I use it in my program I get an error message: sre_constants.error: unbalanced parenthesis
parens = re.findall(r'[A-Z]\)', test_file)
It seems that it is ignoring the escape character, but why is that? Any help or alternative approaches would be much appreciated.
Upvotes: 3
Views: 223
Reputation: 35522
This works:
>>> st='first A) you have B) and then C)'
>>> re.findall(r'[A-Z]\)',st)
['A)', 'B)', 'C)']
or:
>>> re.findall('[A-Z]\\)',st)
['A)', 'B)', 'C)']
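For comparison, here is a minimal sketch of one way the reported error can be reproduced. It assumes the backslash is lost somewhere before the pattern reaches re, which the question does not confirm. The helper name compiles is hypothetical and exists only for this illustration.

```python
import re

def compiles(pattern):
    """Return True if `pattern` is a valid regular expression (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

escaped = r'[A-Z]\)'    # the backslash makes ")" a literal character
unescaped = '[A-Z])'    # a bare ")" closes a group that was never opened

print(compiles(escaped))
print(compiles(unescaped))
print(re.findall(escaped, 'first A) you have B) and then C)'))
```

If the second pattern is what the program actually ends up compiling, re rejects it as unbalanced, so printing the exact pattern string just before the re.findall call is a quick check.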
Is test_file actually a string? What you have should work (try it in the Python shell), so my suspicion is your second parameter to re.findall ...
If, as your naming would suggest, it is a file object, you would need to do this:
with open('file.txt', 'r') as f:
    for line in f:
        line_matches = re.findall(pattern, line)
        # ... do something with the list of matches from that line
        # ... next line
or, for the whole file:
with open('file.txt', 'r') as f:
    contents = f.read()
    file_matches = re.findall(pattern, contents, flags=re.MULTILINE)
    # ... do something with the list of matches from the whole file
or, the encoding of that file may be wrong...
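The whole-file approach can be sketched end to end as follows. This is only an illustration under stated assumptions: the sample text is made up, a temporary file stands in for the real one, and UTF-8 is assumed as the encoding.

```python
import os
import re
import tempfile

pattern = r'[A-Z]\)'

# Create a small sample file so the sketch runs on its own (contents are made up).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write('first A) you have\nB) and then C)\n')
    path = tmp.name

try:
    # Assumption: the file is UTF-8; pass the real encoding if it differs.
    with open(path, 'r', encoding='utf-8') as f:
        contents = f.read()  # a str, which is what re.findall expects
    file_matches = re.findall(pattern, contents)
finally:
    os.remove(path)  # clean up the temporary sample file

print(file_matches)
```

The re.MULTILINE flag is left out here because it only changes how ^ and $ behave, and this pattern uses neither; passing it is harmless.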
Upvotes: 2
Reputation: 4467
The regular expression is fine; the problem might be the encoding of test_file.
Check the question "Python Unicode Regular Expression" to see if anything there helps.
Upvotes: 0