Anatoliy Sokolov

Reputation: 397

How to match a file line by line with a regular expression in Python

I have a file with one word on each line, and I need a function load_words that returns a list of every word matching a regular expression passed to it. For example, load_words("words", r"^[A-Z].{2}$") should return ['A-1', 'AAA', 'AAE'] and others, which makes sense because all three fit the expression: a capital letter followed by any two characters. Here is my current function:

def load_words(filename,regexp):
    f=open(filename)
    t=[]
    x=None
    for line in f:
        x=(re.match(regexp,line))
        if x!=None:
            t.append(x)
    return t

I read the file line by line and, if the line matches the expression, add it to the list. I'm not sure which function in re I need to match regular expressions against strings, so I am quite likely using the wrong one: my output looks like memory addresses instead of strings.

Upvotes: 0

Views: 109

Answers (2)

yael

Reputation: 337

import re

def load_words(filename, regexp):
    # Read the whole file; re.MULTILINE makes ^ and $ match at every line
    with open(filename) as f:
        data = f.read()
    return re.findall(regexp, data, re.MULTILINE)
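
To see why the re.MULTILINE flag matters here, a small sketch on an in-memory string may help. The sample words in data are made up for illustration and are not from the asker's file. Without the flag, ^ and $ anchor only to the start and end of the whole string, so an anchored pattern finds nothing in multi-line text:

```python
import re

# Hypothetical sample standing in for the contents of the words file
data = "A-1\nAAA\nabc\nAAAA\nAAE\n"
pattern = r"^[A-Z].{2}$"

# With re.MULTILINE, ^ and $ match at the start and end of each line
per_line = re.findall(pattern, data, re.MULTILINE)

# Without it, ^ only matches at the very start of the string and
# $ only at the very end, so no three-character line can match
whole_string = re.findall(pattern, data)

print(per_line)
print(whole_string)
```

Keep in mind that re.findall returns the captured groups rather than the whole match when the pattern contains capturing groups, so a pattern with parentheses may need (?:...) to behave the same way.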

Upvotes: 0

alecxe

Reputation: 473803

You are collecting the Match objects in a list, but you need the matched string, which is available through the match's group() method. Replace:

t.append(x)

with:

t.append(x.group(0))

Note that you don't need to compare x with None explicitly. re.match returns None when there is no match, and a Match object is always truthy, so you can simply test it:

x = re.match(regexp, line)
if x:
    t.append(x.group(0))
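
Putting the pieces together, the whole function might look roughly like the sketch below. Compiling the pattern once and opening the file in a with block are my own additions rather than part of the original code, so treat them as one possible arrangement:

```python
import re

def load_words(filename, regexp):
    """Return the matched text for each line of `filename` that matches `regexp`."""
    pattern = re.compile(regexp)  # compile once, reuse for every line
    words = []
    with open(filename) as f:  # the file is closed when the block exits
        for line in f:
            # Each line still ends with "\n"; `$` matches just before a
            # trailing newline and `.` does not match a newline, so the
            # matched text for a fully anchored pattern excludes it.
            m = pattern.match(line)
            if m:
                words.append(m.group(0))
    return words
```

group(0) is only the part of the line that the pattern matched. With a fully anchored pattern such as r"^[A-Z].{2}$" that is the whole word, but with an unanchored pattern it can be a prefix; appending line.rstrip("\n") instead would keep the complete word in that case.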

Upvotes: 1
