Joseph Nicolls

Reputation: 33

How to properly use for-loop with python's regex library?

I am trying to implement a program that will take a file, find all the regex matches in the document, and concatenate the specific matches I want into a single string, which is then written to a file.

import re
import sys

f = open ('input/' + sys.argv[1], "r")
fd = f.read()
s = ''

pattern = re.compile(r'(?:(&#\d*|>))(.*?)(?=(&#\d*|<))')

for e in re.findall(pattern, fd, re.S)
        s += e[1]

f.close()
o = open ( 'output' + sys.argv[1], 'w', 0)
o.write(s)
o.close()

However, when I try to run this, I get the following error:

 File "./regex.py", line 8
    for e in re.findall(pattern, fd, re.S)


Upvotes: 1

Views: 66

Answers (2)

plamut

Reputation: 3206

Not directly related to the original question (it was indeed a missing colon), but I suggest taking a different approach to string concatenation. Repeatedly appending new chunks with += will create a new string each time (because strings are immutable). A better way is to create an accumulator list, append each matched string to it, and then join these strings into a single one using ''.join(my_list_with_matches).
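A minimal sketch of the accumulator-and-join approach, assuming the same pattern as in the question. The helper name extract_text and the sample string are made up for illustration. Note that re.S is passed to re.compile here: the module-level re functions raise ValueError when a flags argument accompanies an already-compiled pattern, so the flag has to be set at compile time.

```python
import re

# Same pattern as in the question. The flag goes into re.compile(), because
# re.findall() rejects a flags argument alongside a compiled pattern.
pattern = re.compile(r'(?:(&#\d*|>))(.*?)(?=(&#\d*|<))', re.S)


def extract_text(document):
    """Collect the second group of every match and join once at the end."""
    pieces = []  # accumulator list instead of repeated string +=
    for match in pattern.findall(document):
        pieces.append(match[1])  # findall yields one tuple of groups per match
    return ''.join(pieces)


# Hypothetical input, only for illustration.
sample = '<p>Hello</p><b>World</b>'
result = extract_text(sample)
print(result)  # prints HelloWorld
```

The loop can also be collapsed into ''.join(m[1] for m in pattern.findall(document)) if the explicit list is not needed elsewhere.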

Upvotes: 0

Eli Rose

Reputation: 7048

You forgot a colon at the end of that line.

for e in re.findall(pattern, fd, re.S):

You seem to have chopped off the type of the error (SyntaxError, I imagine) but that information is very helpful. Seeing SyntaxError instead of some other type would let you know that your error has nothing to do with regexes.
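To see which error type a missing colon produces, here is a small sketch that feeds a deliberately broken for statement to the built-in compile() and catches the result. The source string is a made-up stand-in for the loop in the question, not the original file.

```python
# Hypothetical snippet reproducing the mistake: a for statement without its colon.
broken_source = "for e in [1, 2, 3]\n    print(e)\n"

caught_syntax_error = False
try:
    # compile() only parses the source; the loop body is never executed.
    compile(broken_source, "<example>", "exec")
except SyntaxError as err:
    caught_syntax_error = True
    # The exception carries the type name and the offending line number.
    print(type(err).__name__, "on line", err.lineno)
```

Because the parser rejects the file before any statement runs, the regex call on that line is never reached.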

Upvotes: 1
