dragonfire88

Reputation: 1

Python regex to find all digits not working

For some reason, my list named lst is always empty despite myfile.txt containing numbers scattered in it. I can't figure out why. Can someone advise me? Thanks!

import re
lst = ()

fname = "myfile.txt"
try:
    with open(fname, 'r') as fh:
        for line in fh:
            lst = [int(s) for s in re.findall(r'\b\d+\b',line)]
        print lst
except IOError:
    print "Error reading file!"

Upvotes: 0

Views: 122

Answers (3)

yael

Reputation: 337

with open(fname, 'r') as fh:
    data = fh.read()
    lst = re.findall(r'\b\d+\b',data)
    print lst
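One thing to watch with the whole-file approach: re.findall returns strings, not ints, so convert them if you need numbers. A minimal sketch, assuming an in-memory string (sample, made up here) in place of the file contents and a hypothetical helper name numbers_in_text; single-argument print(...) runs on both Python 2 and 3:

```python
import re

def numbers_in_text(data):
    # Hypothetical helper: find standalone digit runs in the whole text
    # and convert each match from str to int.
    return [int(s) for s in re.findall(r'\b\d+\b', data)]

# Made-up stand-in for what fh.read() would return.
sample = "a 1 b\nc 22 d\nno digits\n"

print(re.findall(r'\b\d+\b', sample))  # ['1', '22'] (strings)
print(numbers_in_text(sample))         # [1, 22] (ints)
```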

Upvotes: 0

Morgan G

Reputation: 3109

A couple of issues. If you want to grab every run of digits, don't use '\b\d+\b', because that only matches digits with a word boundary on both sides, i.e. digits that are not attached to letters or underscores. Example: it will get "23 street" -> '23', but if you want to get "23rd street" -> '23' you need to use '\d+'. Otherwise, one of the other commenters is correct: it's because your print lst is outside of the for loop, so you only ever see the result for the last line. I would be willing to bet the last line of myfile.txt does not contain any digits that are by themselves, so the list it prints is empty. There are a few ways to fix it; I'll show you two.

import re
lst = ()

fname = "myfile.txt"
try:
    with open(fname, 'r') as fh:
        for line in fh:
            lst = [int(s) for s in re.findall(r'\d+',line)]
            print lst
except IOError:
    print "Error reading file!"

This is the easier way and you deal with less code, but it prints a separate list for each line. If you want all of the results to live inside one list when you print it, you could do something like this instead.

import re
lst = []

fname = "myfile.txt"
try:
    with open(fname, 'r') as fh:
        for line in fh:
            lst.append([int(s) for s in re.findall(r'\b\d+\b',line)])
        print lst
except IOError:
    print "Error reading file!"

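You'll notice that on line 2 of this one I turned lst into a list instead of a tuple. Tuples have no .append, so this lets us .append the result of each line's list comprehension and print everything afterwards. Note that lst ends up as a list of lists, one inner list per line of the file.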
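To see the word-boundary difference for yourself, here is a small sketch. The sample strings and the two helper names are made up for illustration; the regexes are the two discussed above.

```python
import re

def find_whole_numbers(text):
    # \b\d+\b: digit runs with a word boundary on both sides,
    # so digits glued to letters (as in "23rd") are skipped.
    return re.findall(r'\b\d+\b', text)

def find_digit_runs(text):
    # \d+: every run of digits, wherever it appears.
    return re.findall(r'\d+', text)

for text in ["23 street", "23rd street", "room42b"]:
    print("%r -> whole: %r, any: %r"
          % (text, find_whole_numbers(text), find_digit_runs(text)))
```

Only "23 street" yields a match with the \b version; all three yield a match with plain \d+.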

Upvotes: 1

FatalError

Reputation: 54551

Right now you're not accumulating numbers through the whole file, but replacing the list every time. If you meant to get them all, something like this would be more appropriate:

import re

lst = []

fname = "myfile.txt"
try:
    with open(fname, 'r') as fh:
        for line in fh:
            lst.extend(int(s) for s in re.findall(r'\b\d+\b',line))
        print lst
except IOError:
    print "Error reading file!"
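Here is a quick way to compare replacing with accumulating without touching a file. This is only a sketch: the lines list is a made-up stand-in for the lines of myfile.txt, and the two function names are hypothetical.

```python
import re

# Made-up stand-in for the file's lines; the last one has no digits.
lines = ["a 1 b 22\n", "no digits here\n"]

def last_line_only(lines):
    lst = []
    for line in lines:
        # Rebinding lst discards whatever earlier lines produced.
        lst = [int(s) for s in re.findall(r'\b\d+\b', line)]
    return lst

def accumulate(lines):
    lst = []
    for line in lines:
        # extend adds this line's numbers to the ones already collected.
        lst.extend(int(s) for s in re.findall(r'\b\d+\b', line))
    return lst

print(last_line_only(lines))  # []
print(accumulate(lines))      # [1, 22]
```

The first version comes back empty because the final line has no digits, which matches the symptom in the question.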

Upvotes: 1
