Michael Anderson

Reputation: 73600

How can I handle several regexp cases neatly in Python?

I get some input in Python that I need to parse using regexps.

At the moment I'm using something like this:

import re

matchOK = re.compile(r'^OK\s+(\w+)\s+(\w+)$')
matchFailed = re.compile(r'^FAILED\s(\w+)$')
# ... a bunch more regexps

for l in big_input:
    match = matchOK.search(l)
    if match:
        # do something with match
        continue
    match = matchFailed.search(l)
    if match:
        # do something with match
        continue
    # ... a bunch more of these
    # Then some error handling if nothing matches

Now usually I love Python because it's nice and succinct. But this feels verbose. I'd expect to be able to do something like this:

for l in big_input:      
  if match = matchOK.search(l):
     #do something with match     
  elif match = matchFailed.search(l):
     #do something with match 
  #.... a bunch more of these
  else:
    # error handling

Am I missing something, or is the first form as neat as I'm going to get?

Upvotes: 5

Views: 186

Answers (4)

eat

Reputation: 7530

class helper:
    # Remembers the most recent match so it can be read back after the test.
    def __call__(self, match):
        self.match = match
        return bool(match)

h = helper()
for l in big_input:
    if h(matchOK.search(l)):
        pass  # do something with h.match
    elif h(matchFailed.search(l)):
        pass  # do something with h.match
    # ... a bunch more of these
    else:
        pass  # error handling

Or with the matchers as methods of a class:

class matcher:
    def __init__(self):
        # compile matchers
        self.ok = ...
        self.failed = ...
        # ... more compiled patterns

    def matchOK(self, l):
        # a compiled pattern is not callable, so go through search()
        self.match = self.ok.search(l)
        return bool(self.match)

    def matchFailed(self, l):
        self.match = self.failed.search(l)
        return bool(self.match)

    # ... one method per remaining pattern

m = matcher()
for l in big_input:
    if m.matchOK(l):
        pass  # do something with m.match
    elif m.matchFailed(l):
        pass  # do something with m.match
    # ... a bunch more of these
    else:
        pass  # error handling
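
On Python 3.8 or later, the assignment expression operator := gives something very close to the form asked for in the question, without a helper object. A minimal sketch, reusing the two patterns from the question; the parse function and the result tuples are only there to make the example self-contained:

```python
import re

matchOK = re.compile(r'^OK\s+(\w+)\s+(\w+)$')
matchFailed = re.compile(r'^FAILED\s(\w+)$')

def parse(big_input):
    # parse() and the tuples it collects are illustrative, not part of the question.
    results = []
    for l in big_input:
        if match := matchOK.search(l):
            results.append(('ok', match.groups()))
        elif match := matchFailed.search(l):
            results.append(('failed', match.groups()))
        else:
            results.append(('error', l))  # nothing matched
    return results
```

Each branch binds match and tests it in the same expression, so the if/elif chain stays flat.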

Upvotes: 3

Tom Anderson

Reputation: 47253

Even better, how about a slightly simpler version of eat's code using a nested function:

import re

matchOK = re.compile("ok")
matchFailed = re.compile("failed")
big_input = ["ok to begin with", "failed later", "then gave up"]

for l in big_input:
    match = None
    def matches(pattern):
        global match
        match = pattern.search(l)
        return match
    if matches(matchOK):
        print("matched ok:", l, match.start())
    elif matches(matchFailed):
        print("failed:", l, match.start())
    else:
        print("ignored:", l)

Note that this works when the loop is at the top level of a module, but it is not easily converted into a function: the variable match still has to be a true global at the top level.
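
On Python 3, one way around that restriction is to declare match as nonlocal instead of global, which lets the whole loop live inside a function. A rough sketch under that assumption; the classify function and the tuples it returns are made up for illustration:

```python
import re

matchOK = re.compile("ok")
matchFailed = re.compile("failed")

def classify(lines):
    # classify() is a hypothetical wrapper, not part of the code above.
    results = []
    for l in lines:
        match = None

        def matches(pattern):
            nonlocal match  # rebinds the variable in classify(), not a global
            match = pattern.search(l)
            return match

        if matches(matchOK):
            results.append(("ok", match.start()))
        elif matches(matchFailed):
            results.append(("failed", match.start()))
        else:
            results.append(("ignored", None))
    return results
```

Calling classify(["ok to begin with", "failed later", "then gave up"]) walks the same three lines as the top-level version.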

Upvotes: -1

eyquem

Reputation: 27585

What about something like this?

import re


def f_OK(ch):
    print('BINGO ! : %s , %s' % re.match(r'OK\s+(\w+)\s+(\w+)', ch).groups())

def f_FAIL(ch):
    print('only one : ' + ch.split()[-1])

# several_func[i] handles a match of several_REs[i]
several_func = (f_OK, f_FAIL)

several_REs = (r'OK\s+\w+\s+\w+',
               r'FAILED\s+\w+')

# Builds '^(RE1)|(RE2)$': one capturing group per alternative
globpat = re.compile(')|('.join(several_REs).join(('^(', ')$')))


with open('big_input.txt') as handle:
    for i, line in enumerate(handle):
        print('line ' + str(i) + ' - ', end=' ')
        mat = globpat.search(line)
        if mat:
            # lastindex is the number of the group that matched
            several_func[mat.lastindex - 1](mat.group())
        else:
            print('## no match ## ' + repr(line))

I tried it on a file whose content is:

OK tiramisu sunny
FAILED overclocking
FAILED nuclear
E = mcXc
OK the end

The result is:

line 0 -  BINGO ! : tiramisu , sunny
line 1 -  only one : overclocking
line 2 -  only one : nuclear
line 3 -  ## no match ## 'E = mcXc\n'
line 4 -  BINGO ! : the , end

This allows you to define any number of REs and functions separately, and to add or remove some as needed.
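
A possible variation on the same idea uses named groups, so that the handler is looked up by name through the match object's lastgroup attribute rather than by position. This is only a sketch: the group names, the handlers dict and the dispatch function are assumptions made for the example, and the handlers return strings instead of printing.

```python
import re

# One named group per case; the names "ok" and "failed" are illustrative.
combined = re.compile(
    r'^(?:(?P<ok>OK\s+\w+\s+\w+)|(?P<failed>FAILED\s+\w+))\s*$'
)

def describe_ok(text):
    return 'BINGO ! : %s , %s' % tuple(text.split()[1:])

def describe_failed(text):
    return 'only one : ' + text.split()[-1]

handlers = {'ok': describe_ok, 'failed': describe_failed}

def dispatch(line):
    mat = combined.search(line)
    if mat is None:
        return '## no match ## ' + repr(line)
    # Only the two named groups capture, so lastgroup names the case that matched.
    return handlers[mat.lastgroup](mat.group(mat.lastgroup))
```

Adding a case then means adding one named alternative to the pattern and one entry to the dict.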

Upvotes: 0

Tom Anderson

Reputation: 47253

How about something like:

for l in big_input:
    for p in (matchOK, matchFailed):  # other patterns go in here
        match = p.search(l)
        if match:
            break
    if not match:
        p = None  # no patterns matched
    if p is matchOK:
        pass  # do something with match
    elif p is matchFailed:
        pass  # do something with match
    # ... a bunch more of these
    else:
        assert p is None
        # Then some error handling if nothing matches
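
Taking the first-match loop one step further, each pattern could carry its handler with it, which removes the second if/elif chain altogether. A minimal sketch, assuming the two patterns from the question; on_ok, on_failed, CASES and handle are hypothetical names:

```python
import re

def on_ok(match):
    return "ok: " + ", ".join(match.groups())

def on_failed(match):
    return "failed: " + match.group(1)

# (pattern, handler) pairs, tried in order; other patterns go in here
CASES = [
    (re.compile(r'^OK\s+(\w+)\s+(\w+)$'), on_ok),
    (re.compile(r'^FAILED\s(\w+)$'), on_failed),
]

def handle(l):
    for pattern, handler in CASES:
        match = pattern.search(l)
        if match:
            return handler(match)
    # error handling if nothing matches
    raise ValueError("no pattern matched: %r" % l)
```

The loop body then becomes a single call to handle(l) per line.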

Upvotes: 0
