Reputation: 74134
I have studied the generators feature and I think I understand it, but I would like to know where I could apply it in my code.
I have in mind the following example, which I read in the book "Python Essential Reference":
# tail -f
import time

def tail(f):
    f.seek(0, 2)  # move to the end of the file
    while True:
        line = f.readline()
        if not line:
            time.sleep(0.1)  # nothing new yet; wait briefly and retry
            continue
        yield line
Do you have any other good examples where generators are the best tool for the job, like tail -f?
How often do you use generators, and in which kinds of functionality or parts of a program do you usually apply them?
Upvotes: 5
Views: 316
Reputation: 391962
In all cases where I have algorithms that read anything, I use generators exclusively.
Why?
Layering in filtering, mapping and reduction rules is so much easier in a context of multiple generators.
Example:
def discard_blank( source ):
    for line in source:
        if len(line) == 0:
            continue
        yield line

def clean_end( source ):
    for line in source:
        yield line.rstrip()

def split_fields( source ):
    for line in source:
        yield line.split()

def convert_pos( tuple_source, position ):
    for line in tuple_source:
        # Wrap the converted field in a list so it can be joined with the slices.
        yield line[:position] + [int(line[position])] + line[position+1:]

with open('somefile','r') as source:
    data = convert_pos( split_fields( discard_blank( clean_end( source ) ) ), 0 )
    total = 0
    # The generators are lazy, so consume them while the file is still open.
    for l in data:
        print(l)
        total += l[0]
    print(total)
My preference is to use many small generators so that a small change is not disruptive to the entire process chain.
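To illustrate that point, here is a minimal sketch that runs the same kind of chain over an in-memory list instead of a file. The skip_comments stage and the sample lines are hypothetical, made up for this illustration; the idea is only that adding a rule means wrapping one more generator around the chain while the other stages stay untouched.

```python
def clean_end(source):
    # Strip trailing whitespace (including the newline) from each line.
    for line in source:
        yield line.rstrip()

def discard_blank(source):
    # Drop lines that are empty.
    for line in source:
        if line:
            yield line

def skip_comments(source):
    # Hypothetical extra stage: drop lines that start with '#'.
    for line in source:
        if not line.startswith('#'):
            yield line

def split_fields(source):
    # Split each line into whitespace-separated fields.
    for line in source:
        yield line.split()

# Made-up sample input standing in for a file.
lines = ["3 apples\n", "\n", "# a comment\n", "4 pears  \n"]

# The new stage is slotted into the chain; no other stage changes.
rows = list(split_fields(skip_comments(discard_blank(clean_end(lines)))))
print(rows)
```

Removing skip_comments from the chain again is a one-token change, which is the main benefit of keeping each stage small.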
Upvotes: 2
Reputation: 43130
I use them a lot when I implement scanners (tokenizers) or when I iterate over data containers.
Edit: here is a demo tokenizer I used for a C++ syntax-highlighting program:
whitespace = ' \t\r\n'
operators = '~!%^&*()-+=[]{};:\'"/?.,<>\\|'

def scan(s):
    "returns a token and a state/token id"
    words = {0:'', 1:'', 2:''} # normal, operator, whitespace
    state = 2 # I pick ws as first state
    for c in s:
        if c in operators:
            if state != 1:
                yield (words[state], state)
                words[state] = ''
            state = 1
            words[state] += c
        elif c in whitespace:
            if state != 2:
                yield (words[state], state)
                words[state] = ''
            state = 2
            words[state] += c
        else:
            if state != 0:
                yield (words[state], state)
                words[state] = ''
            state = 0
            words[state] += c
    yield (words[state], state)
Usage example:
>>> it = scan('foo(); i++')
>>> next(it)
('', 2)
>>> next(it)
('foo', 0)
>>> next(it)
('();', 1)
>>> next(it)
(' ', 2)
>>> next(it)
('i', 0)
>>> next(it)
('++', 1)
>>>
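For the other use mentioned above, iterating over data containers, a rough sketch could look like the following. The Tree class is hypothetical and only serves as an illustration; the point is that writing __iter__ as a generator lets callers loop over the container without knowing how it is laid out internally.

```python
class Tree:
    """Hypothetical binary tree node, for illustration only."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    def __iter__(self):
        # In-order traversal: left subtree, this node, then right subtree.
        if self.left is not None:
            for v in self.left:
                yield v
        yield self.value
        if self.right is not None:
            for v in self.right:
                yield v

tree = Tree(4, Tree(2, Tree(1), Tree(3)), Tree(6))
values = list(tree)
print(values)
```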
Upvotes: 6
Reputation: 45131
In general, to separate data acquisition (which might be complicated) from consumption. In particular:
yield-ing records from each one, the consumer only sees single data items arriving.
Generators can also work as coroutines. You can pass data into them using nextval = g.send(data) on the 'consumer' side and data = yield nextval on the generator side. In this case the generator and its consumer 'swap' values. You can even make yield throw an exception within the generator context: g.throw(exc) does that.
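A small sketch of the coroutine style described above. The running_average function and the choice of ValueError as a "reset" signal are assumptions made for this example, not anything prescribed by the language.

```python
def running_average():
    # Coroutine: receives numbers via send() and yields the average so far.
    total = 0.0
    count = 0
    average = None
    while True:
        try:
            value = yield average
        except ValueError:
            # g.throw(ValueError) raises here, at the yield; treat it as a reset.
            total, count, average = 0.0, 0, None
            continue
        total += value
        count += 1
        average = total / count

g = running_average()
next(g)                            # prime the coroutine: run to the first yield
first = g.send(10)                 # generator receives 10, hands back the average
second = g.send(20)
after_reset = g.throw(ValueError)  # handled inside; the next yield gives None
third = g.send(4)
print(first, second, after_reset, third)
```

Note that the generator has to be advanced to its first yield (the next(g) call) before send() can deliver a value.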
Upvotes: 1
Reputation: 816900
Whenever your code would generate an unlimited number of values or, more generally, whenever building the whole list up front would consume too much memory.
Or if you are likely not to iterate over the whole generated list (and the list is very large). There is no point in generating every value first (and waiting for that to finish) if some of them are never used.
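As a sketch of that second case: the consumer below stops at the first match, so values past that point are never computed. The squares generator and the traced helper are hypothetical names used only to make the laziness visible.

```python
def squares():
    # Unbounded generator: yields 0, 1, 4, 9, ... on demand.
    n = 0
    while True:
        yield n * n
        n += 1

computed = []

def traced(source):
    # Hypothetical helper that records which values were actually produced.
    for value in source:
        computed.append(value)
        yield value

# Take the first square above 50; nothing past it is ever generated.
first_big = next(v for v in traced(squares()) if v > 50)
print(first_big, computed)
```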
My latest encounter with generators was when I implemented a linear recurrent sequence (LRS), such as the Fibonacci sequence.
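This is not the implementation referred to above, just a minimal sketch of how such a sequence might be written as a generator. The linear_recurrence name and its parameters are assumptions for illustration; it expects as many initial terms as coefficients.

```python
from itertools import islice

def linear_recurrence(coeffs, initial):
    """Yield terms of a(n) = coeffs[0]*a(n-1) + ... + coeffs[k-1]*a(n-k)."""
    window = list(initial)  # the k most recent terms, oldest first
    for term in window:
        yield term
    while True:
        # Pair coeffs[0] with the newest term, coeffs[1] with the one before, etc.
        nxt = sum(c * a for c, a in zip(coeffs, reversed(window)))
        yield nxt
        window = window[1:] + [nxt]

# Fibonacci: a(n) = a(n-1) + a(n-2), starting from 0, 1.
fib = list(islice(linear_recurrence([1, 1], [0, 1]), 10))
print(fib)
```

The sequence itself is unbounded; islice decides how many terms are actually computed, so the caller pays only for what it uses.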
Upvotes: 4