ling

Reputation: 1675

How does this yield work in this generator?

def read_large_file(file_handler, block_size=10000):
    block = []
    for line in file_handler:
        block.append(line)
        if len(block) == block_size:
            yield block
            block = []

    # don't forget to yield the last block
    if block:
        yield block

with open(path) as file_handler:
    for block in read_large_file(file_handler):
        print(block)

I am reading the code above, which was written by someone else. About these lines:

if len(block) == block_size:
    yield block
    block = []

Does block = [] ever get executed? I had thought yield works like a return statement. Also, why is there a final if block check after the loop?

Upvotes: 2

Views: 104

Answers (2)

Z4-tier

Reputation: 7978

Yes, it will be executed when the function resumes on the next iteration. Remember, yield is like a pause button for a generator, and generators are usually used within a loop. The yield is sort of returning a value (I say "sort of" because yield is not the same as return), but the next time the generator is advanced, it picks up right after that same spot. The purpose of block = [] is to reset the block to a new, empty list before the next go-around (block.clear() might look like a cheaper alternative, but it would empty the very list that was just yielded, so it is only safe if the caller never keeps a reference to a block).
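To make the pause-and-resume behaviour concrete, here is a minimal sketch that records the order of events. The names demo_generator and events are made up for illustration and are not part of the code in the question.

```python
events = []  # records the order in which things happen


def demo_generator():
    events.append("started")
    yield 1
    # Execution continues here on the next next() call, not from the top.
    events.append("resumed after first yield")
    yield 2
    events.append("resumed after second yield")


gen = demo_generator()      # nothing in the body has run yet
first = next(gen)           # runs up to the first yield, then pauses
events.append("caller got 1")
second = next(gen)          # resumes right after the first yield
events.append("caller got 2")
remaining = list(gen)       # runs the rest of the body; no more values

for event in events:
    print(event)
```

The statement after each yield runs only once the caller asks for the next value, which is exactly when block = [] runs in the original function.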

This code builds up blocks of lines from a file and hands each one back to the caller as soon as it reaches block_size lines. The final if block yields whatever is left over: the lines at the end of the file that did not fill a complete block.
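As a quick check of the leftover case, the sketch below runs the function from the question on an in-memory stand-in for a file. The io.StringIO object, the 7-line input, and block_size=3 are assumptions chosen only to keep the example small.

```python
import io


def read_large_file(file_handler, block_size=10000):
    block = []
    for line in file_handler:
        block.append(line)
        if len(block) == block_size:
            yield block
            block = []

    # Yield the final, partially filled block (if any).
    if block:
        yield block


# io.StringIO stands in for a real file: 7 lines, read in blocks of 3.
fake_file = io.StringIO("".join(f"line {i}\n" for i in range(7)))
block_sizes = [len(block) for block in read_large_file(fake_file, block_size=3)]
print(block_sizes)  # [3, 3, 1]

# When the line count is an exact multiple, nothing is left over,
# and the if block check prevents an empty list from being yielded.
exact_file = io.StringIO("".join(f"line {i}\n" for i in range(6)))
exact_sizes = [len(block) for block in read_large_file(exact_file, block_size=3)]
print(exact_sizes)  # [3, 3]
```

Without the final check, the last line of the 7-line input would be silently dropped.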

Upvotes: 1

Mureinik

Reputation: 311308

yield produces the next value from the generator and suspends it; when the caller asks for another value, the generator continues from where it left off.

Here, lines are read into a block (a list of lines). Whenever a block has been populated with block_size lines, it is yielded as the next value from the generator; the block is then re-initialized to an empty list, and reading continues.
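One detail worth noting is that the re-initialization binds block to a fresh list instead of emptying the old one in place. A minimal sketch of why that can matter when the caller keeps the blocks around; the helper names blocks_fresh and blocks_cleared are hypothetical, and range(5) stands in for the lines of a file.

```python
def blocks_fresh(items, block_size):
    block = []
    for item in items:
        block.append(item)
        if len(block) == block_size:
            yield block
            block = []       # new list; the yielded one is left untouched
    if block:
        yield block


def blocks_cleared(items, block_size):
    block = []
    for item in items:
        block.append(item)
        if len(block) == block_size:
            yield block
            block.clear()    # empties the same list the caller may still hold
    if block:
        yield block


# list() keeps a reference to every yielded block.
kept_fresh = list(blocks_fresh(range(5), 2))
kept_cleared = list(blocks_cleared(range(5), 2))

print(kept_fresh)    # [[0, 1], [2, 3], [4]]
print(kept_cleared)  # [[4], [4], [4]]  (three references to one list)
```

If each block is fully processed before the next one is requested, as in the print loop from the question, both variants behave the same; the difference only shows up when blocks are stored.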

Upvotes: 1
