Reputation: 1675
def read_large_file(file_handler, block_size=10000):
    block = []
    for line in file_handler:
        block.append(line)
        if len(block) == block_size:
            yield block
            block = []
    # don't forget to yield the last block
    if block:
        yield block

with open(path) as file_handler:
    for block in read_large_file(file_handler):
        print(block)
I am reading the code above, which was written by someone else. For these lines:
    if len(block) == block_size:
        yield block
        block = []
Does block = [] ever get a chance to be executed? I had thought yield is like a return statement. Also, why is there an if block check?
Upvotes: 2
Views: 104
Reputation: 7978
Yes, it will be executed when the function resumes on the next iteration. Remember, yield is like a pause button for a generator, and generators are usually consumed in a loop. The yield sort of returns a value (I say "sort of" because yield is not the same as return), but when the generator is next accessed, it picks up at that same spot. The purpose of block = [] is to start a fresh, empty list for the next block. (It may be tempting to use block.clear() instead, but that empties the very list object that was just yielded, so it is only safe if the caller has finished with each block before asking for the next one.)
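One caveat about block.clear() can be shown with a small sketch. The two helper names below (blocks_rebind, blocks_clear) are made up for illustration and work on a plain list rather than a file; the only difference between them is how the block is reset after the yield.

```python
def blocks_rebind(lines, block_size=2):
    block = []
    for line in lines:
        block.append(line)
        if len(block) == block_size:
            yield block
            block = []       # bind the name to a new list; the yielded one is untouched
    if block:
        yield block


def blocks_clear(lines, block_size=2):
    block = []
    for line in lines:
        block.append(line)
        if len(block) == block_size:
            yield block
            block.clear()    # empties the same list object the caller just received
    if block:
        yield block


lines = ["a", "b", "c", "d", "e"]

rebind_result = list(blocks_rebind(lines))
clear_result = list(blocks_clear(lines))

print(rebind_result)  # [['a', 'b'], ['c', 'd'], ['e']]
print(clear_result)   # [['e'], ['e'], ['e']]  (three references to one list)
```

If the caller only looks at each block inside the loop body and then discards it, as the print(block) loop in the question does, both versions behave the same. The difference shows up only when blocks are kept around, for example by collecting them with list().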
This code builds up blocks of lines from a file and hands each one back to the caller as soon as it reaches block_size lines. The final if block check yields the last bit, if there are leftover lines that didn't fill a complete block.
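To see the leftover block in practice, here is a small sketch that runs the question's function on an in-memory file. Using io.StringIO in place of a real file and a block_size of 2 are assumptions made only to keep the example short.

```python
import io


def read_large_file(file_handler, block_size=10000):
    block = []
    for line in file_handler:
        block.append(line)
        if len(block) == block_size:
            yield block
            block = []
    # don't forget to yield the last block
    if block:
        yield block


# Five lines with a block size of 2: two full blocks plus one leftover line.
odd_file = io.StringIO("1\n2\n3\n4\n5\n")
odd_blocks = list(read_large_file(odd_file, block_size=2))
print(odd_blocks)   # [['1\n', '2\n'], ['3\n', '4\n'], ['5\n']]

# Four lines divide evenly, so the final list is empty and nothing extra is yielded.
even_file = io.StringIO("1\n2\n3\n4\n")
even_blocks = list(read_large_file(even_file, block_size=2))
print(even_blocks)  # [['1\n', '2\n'], ['3\n', '4\n']]
```

Without the final if block check, the '5\n' line in the first case would be silently dropped. The check also keeps an empty list from being yielded in the second case.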
Upvotes: 1
Reputation: 311308
yield produces the next output of the generator and then allows it to continue generating values.
Here, lines are read into a block (a list of lines). Whenever a block is populated with enough lines, it is yielded as the next value from the generator; the block is then re-initialized to an empty list and reading continues.
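The order of events can be made visible by driving a generator by hand with next(). This is only a sketch: traced_blocks and its log parameter are hypothetical names added for illustration, and the function takes a plain list instead of a file.

```python
def traced_blocks(lines, log, block_size=2):
    block = []
    for line in lines:
        block.append(line)
        if len(block) == block_size:
            log.append("generator: about to yield")
            yield block
            # Execution continues here on the next call to next().
            log.append("generator: resumed after yield")
            block = []
    if block:
        yield block


log = []
gen = traced_blocks(["a", "b", "c"], log)

first = next(gen)    # runs the generator up to the first yield
log.append("caller: got first block")

second = next(gen)   # resumes after the yield, resets block, reads "c"
log.append("caller: got second block")

for entry in log:
    print(entry)
# generator: about to yield
# caller: got first block
# generator: resumed after yield
# caller: got second block
```

The "resumed after yield" entry appears only after the caller asks for the next value, which is why the line following yield does get executed, unlike code placed after a return.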
Upvotes: 1