Reputation: 20301
I have a python script which process a file line by line, if the line matches a regex, it calls a function to handle it.
My question is is there a better write to refactor my script. The script works, but as it is, i need to keep indent to the right of the editor as I add more and more regex for my file.
Thank you for any idea. Now my code end up like this:
for line in fi.readlines(): result= reg1.match(line) if result: handleReg1(result) else: result = reg2.match(line) if result: handleReg2(result) else: result = reg3.match(line) if result: handleReg3(result) else: result = reg4.match(line) if result: handleReg4(result) else: result = reg5.match(line) if result: handleReg5(result)
Upvotes: 3
Views: 226
Reputation: 18816
Here's a trivial one:
handlers = { reg1 : handleReg1, ... }
for line in fi.readlines():
for h in handlers:
x = h.match(line)
if x:
handlers[h](x)
If there could be a line that matches several regexps this code will be different from the code you pasted: it will call several handlers. Adding break
won't help, because the regexps will be tried in a different order, so you'll end up calling the wrong one. So if this is the case you should iterate over list:
handlers = [ (reg1, handleReg1), (reg2, handleReg2), ... ]
for line in fi.readlines():
for reg, handler in handlers:
x = reg.match(line)
if x:
handler(x)
break
Upvotes: 1
Reputation: 29706
An alternate approach that might work for you is to combine all the regexps into one giant regexp and use m.group() to detect which matched. My intuition says this should be faster, but I haven't tested it.
>>> reg = re.compile('(cat)|(dog)|(apple)')
>>> m = reg.search('we like dogs')
>>> print m.group()
dog
>>> print m.groups()
(None, 'dog', None)
This gets complicated if the regexps you're testing against are themselves complicated or use match groups.
Upvotes: 0
Reputation: 7148
I'd switch to using a data structure mapping regexes to functions. Something like:
map = { reg1: handleReg1, reg2: handleReg2, etc }
Then you just loop through them:
for reg, handler in map.items():
result = reg.match(line)
if result:
handler(result)
break
If you need the matches to happen in a particular order you'll need to use a list instead of a dictionary, but the principal is the same.
Upvotes: 12