gkokaisel

Reputation: 33

Regular Expression for finding a parenthesis in Python

I am trying to find all matches for a capital letter followed by a closing parenthesis, i.e. A), B), C), etc. I tested the pattern with re-try (http://re-try.appspot.com/) and it works perfectly, but when I use it in my program I get an error message: sre_constants.error: unbalanced parenthesis

parens = re.findall(r'[A-Z]\)', test_file)

It seems that it is ignoring the escape character, but why is that? Any help or alternative approaches would be much appreciated.

Upvotes: 3

Views: 223

Answers (2)

the wolf

Reputation: 35522

This works:

>>> import re
>>> st='first A) you have B) and then C)'
>>> re.findall(r'[A-Z]\)',st)
['A)', 'B)', 'C)']

or:

>>> re.findall('[A-Z]\\)',st)
['A)', 'B)', 'C)']

Is test_file actually a string? What you have should work (try it in the Python shell), so my suspicion is that the problem lies in the second argument you pass to re.findall.
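To narrow this down, it may help to check the two suspects separately. The snippet below is only a sketch, assuming Python 3: it confirms that the escaped pattern compiles, shows that the same pattern without the backslash is rejected by the regex compiler, and uses a hypothetical helper, describe, to report what kind of object is being searched.

```python
import re

pattern = r'[A-Z]\)'

# The escaped pattern compiles and finds each capital letter followed by ")".
matches = re.findall(pattern, 'first A) you have B) and then C)')
print(matches)

# Without the backslash, the ")" has no matching "(" and compilation fails.
try:
    re.compile('[A-Z])')
    unescaped_error = None
except re.error as exc:
    unescaped_error = exc
print(type(unescaped_error).__name__, unescaped_error)


def describe(obj):
    """Hypothetical helper: return the type name and whether re can search it."""
    return type(obj).__name__, isinstance(obj, (str, bytes))


print(describe('some text'))
```

If describe(test_file) reports anything other than a str or bytes object, the second argument is the problem rather than the pattern. If the error still mentions an unbalanced parenthesis, the pattern that actually reaches re.findall is probably not the one shown in the question.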

If, as your naming would suggest, it is a file object, you would need to do this:

import re

pattern = r'[A-Z]\)'
with open('file.txt', 'r') as f:
    for line in f:
        # findall returns a list of every match on this line
        line_matches = re.findall(pattern, line)
        print(line_matches)  # ... do something with the matches from that line

or, for the whole file:

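import re

pattern = r'[A-Z]\)'
with open('file.txt', 'r') as f:
    contents = f.read()
    # re.MULTILINE only changes ^ and $, so this pattern does not need it
    file_matches = re.findall(pattern, contents)
    print(file_matches)  # ... do something with the matches from the whole file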

Alternatively, the file may be stored in a different encoding than the one used to read it.
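If encoding is the suspect, one thing to try is naming the encoding explicitly when opening the file instead of relying on the platform default. This is a rough sketch, assuming Python 3 and a UTF-8 file; the temporary file and its sample text are made up so the example runs on its own.

```python
import os
import re
import tempfile

pattern = r'[A-Z]\)'

# Write a small sample file (hypothetical content) so the example is self-contained.
with tempfile.NamedTemporaryFile('w', encoding='utf-8', suffix='.txt',
                                 delete=False) as tmp:
    tmp.write('first A) café\nthen B) and C)\n')
    path = tmp.name

try:
    # Pass the encoding explicitly rather than relying on the platform default.
    with open(path, 'r', encoding='utf-8') as f:
        contents = f.read()
    file_matches = re.findall(pattern, contents)
finally:
    os.remove(path)

print(file_matches)
```

If the real file uses some other encoding, substitute its name for 'utf-8'. On Python 2 the built-in open does not accept an encoding argument, so io.open would be needed there instead.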

Upvotes: 2

Qiang Jin

Reputation: 4467

The regular expression is fine; the problem might be the encoding of test_file.

Check the question "Python Unicode Regular Expression" to see if anything there helps.

Upvotes: 0
