Soul Donut
Soul Donut

Reputation: 382

Pythonic way to check generator values using its own elements

Say I'm reading in a file line by line, and want ensure that a certain character (e.g., a delimiter in tabular data) appears the same number of times in each line, based on its count in the first line of the file. The best I could come up with is this approach using a second generator, but I do not feel that this obeys 'there should be one-- and preferably only one --obvious way to do it'. Would using closures conform to this maxim, or is there some other obvious way that I'm failing to see?

def check_line(char='|'):
    while True:
        line_count = line.count(char)
        if i == 0:
           correct_count = line_count
        if line_count != correct_count:
           print """Line %d contains %d %s, should have %d""" % (
                 i, line_count, char, correct_count)
        yield None

with open('file.txt', 'rb') as f:
     checker = check_line()
     for i, line in enumerate(f):
         checker.next()
         ### do more things with line 

I'm trying to abstract from my general problem a bit since I'm most curious about the idiomatic implementation, rather than solving this specific issue with files/delimiters.

Upvotes: 0

Views: 73

Answers (2)

Sebastian Hoffmann
Sebastian Hoffmann

Reputation: 2914

To get rid of globals, pass the read lines into your generator:

def check_line(char='|'):
    correct_count = None

    while True:
        line = yield None
        line_count = line.count(char)
        if not correct_count:
           correct_count = line_count

        if line_count != correct_count:
           print """Line %d contains %d %s, should have %d""" % (
                 i, line_count, char, correct_count)

with open('file.txt', 'rb') as f:
     checker = check_line()
     for i, line in enumerate(f):
         checker.next(line)

A better solution would be to completely wrap a generator:

def check_line(gen, char='|'):
    correct_count = None
    injected = None

    while True:
        i, line = gen.next(injected)
        line_count = line.count(char)

        if not correct_count:
           correct_count = line_count

        if line_count != correct_count:
           print """Line %d contains %d %s, should have %d""" % (
                 i, line_count, char, correct_count)

        injected = yield i, line

with open('file.txt', 'rb') as f:
     checker = check_line(enumerate(f))
     for i, line in checker:
         print line

Upvotes: 2

Ella Rose
Ella Rose

Reputation: 546

I don't think closures or generators are necessary. If it can be done with just a simple function definition, why should it use anything more complicated? Maybe I'm not understanding what you want well enough, but what I'd do is:

def check_lines(_file, delimiter):
    correct_count = _file.readline().count(delimiter)
    for line_number, line in enumerate(_file, start=2): #lets say the first line is line 1, then the first line in this for loop will be line 2
        line_count = line.count(delimiter)
        if line_count != correct_count:
            format_args = (line_number, line_count, delimiter, correct_count)
            print("""Line {} contains {} {}, should have {}""".format(*format_args))

with open(__file__, 'r') as _file:
    check_lines(_file, '(')

I don't know what the body your function is supposed to actually do. It can return or yield whatever information you need it to, instead of printing to stdout.

Upvotes: 2

Related Questions