Reputation: 1
After reading answer1 and answer2, the purpose of yield still looks unclear to me.
In the first case, with the function below,
def createGenerator():
    mylist = range(3)
    for i in mylist:
        yield i*i
invoking createGenerator, as below,

myGenerator = createGenerator()

should return an object (like (x*x for x in range(3))) of type collections.abc.Generator, which is-a collections.abc.Iterator and a collections.abc.Iterable.

To iterate over the myGenerator object and get the first value (0),

next(myGenerator)

would actually invoke __iter__(myGenerator) internally to retrieve a collections.abc.Iterator type object (say obj), and then invoke __next__(obj) to get the first value (0), followed by the for loop of the createGenerator function pausing at the yield keyword.
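A quick check of this in the interpreter (Python 3) seems to confirm part of it:

myGenerator = createGenerator()
print(iter(myGenerator) is myGenerator)  # True - a generator is its own iterator
print(next(myGenerator))                 # 0 - runs the body up to the first yield
print(myGenerator.__next__())            # 1 - next() just calls __next__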
If this understanding (above) is correct, then does the below syntax (second case),
def createGenerator():
    return (x*x for x in range(3))

myGen = createGenerator()  # returns a collections.abc.Generator type object
next(myGen)  # next() must internally invoke __next__(__iter__(myGen)) to provide the first value (0), with no need to pause
wouldn't it suffice to serve the same purpose (above), while looking more readable? Aren't both syntaxes memory efficient? If yes, when should I use the yield keyword? Is there a case where yield is a must?
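For reference, a quick check (Python 3, with a hypothetical helper name gen_yield) suggests both versions produce the same kind of object:

import collections.abc

def gen_yield():                    # first case, hypothetical name
    for i in range(3):
        yield i*i

gen_expr = (x*x for x in range(3))  # second case

print(isinstance(gen_yield(), collections.abc.Generator))  # True
print(isinstance(gen_expr, collections.abc.Generator))     # True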
Upvotes: 1
Views: 704
Reputation: 231738
The generator function and the generator comprehension are basically the same - both produce generator objects:
In [540]: def createGenerator(n):
     ...:     mylist = range(n)
     ...:     for i in mylist:
     ...:         yield i*i
     ...:
In [541]: g = createGenerator(3)
In [542]: g
Out[542]: <generator object createGenerator at 0xa6b2180c>
In [545]: gl = (i*i for i in range(3))
In [546]: gl
Out[546]: <generator object <genexpr> at 0xa6bbbd7c>
In [547]: list(g)
Out[547]: [0, 1, 4]
In [548]: list(gl)
Out[548]: [0, 1, 4]
Both g and gl have the same attributes; produce the same values; run out in the same way.
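For example, once consumed, both are exhausted the same way (a minimal sketch, reusing g and gl from above):

list(g)    # [] - g was already consumed by the earlier list(g)
list(gl)   # [] - generators are single-pass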
Just as with a list comprehension, there are things you can do in the explicit loop that you can't with the comprehension. But if the comprehension does the job, use it. Generators were added to Python sometime around version 2.2. Generator comprehensions are newer (and probably use the same underlying mechanism).
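For example, a generator function can carry state across items, which a single expression can't express cleanly (a sketch, with an assumed helper name running_totals):

def running_totals(seq):
    # state carried across iterations - awkward in a comprehension
    total = 0
    for x in seq:
        total += x
        yield total

list(running_totals([1, 2, 3]))  # [1, 3, 6]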
In Py3, range (or Py2 xrange) produces values one at a time, as opposed to a whole list. It's a range object, not a generator, but works in much the same way. Py3 has extended this in other ways, such as the dictionary keys view and map. Sometimes that's a convenience; other times I forget to wrap them in list().
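A quick illustration of that lazy behavior (assuming Python 3):

r = range(3)               # a lazy sequence object, not a generator
m = map(lambda x: x*x, r)  # a lazy iterator in Py3
print(list(m))             # [0, 1, 4]
print(list(m))             # [] - the map iterator is single-pass
print(list(r))             # [0, 1, 2] - but range can be re-iterated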
The yield can be more elaborate, allowing 'feedback' from the caller, e.g.
In [564]: def foo(n):
     ...:     i = 0
     ...:     while i<n:
     ...:         x = yield i*i
     ...:         if x is None:
     ...:             i += 1
     ...:         else:
     ...:             i = x
     ...:
In [576]: f = foo(3)
In [577]: next(f)
Out[577]: 0
In [578]: f.send(-3) # reset the counter
Out[578]: 9
In [579]: list(f)
Out[579]: [4, 1, 0, 1, 4]
The way I think of a generator operating is that creation initializes an object with code and initial state. next() runs it up to the yield, and returns that value. The next next() lets it spin again until it hits a yield, and so on until it hits a StopIteration condition. So it's a function that maintains an internal state and can be called repeatedly with next or in a for iteration. With send and yield from and so on, generators can be much more sophisticated.
Normally a function runs until done, and returns. The next call to the function is independent of the first - unless you use globals or error-prone defaults.
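A minimal sketch of that contrast (hypothetical names plain_counter and generator_counter):

def plain_counter():
    # each call starts from scratch - no state survives the return
    return 1

def generator_counter():
    # state (n) survives between next() calls
    n = 0
    while True:
        n += 1
        yield n

g = generator_counter()
print(next(g), next(g), next(g))         # 1 2 3
print(plain_counter(), plain_counter())  # 1 1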
https://www.python.org/dev/peps/pep-0289/ is the PEP for generator expressions, from v 2.4.
This PEP introduces generator expressions as a high performance, memory efficient generalization of list comprehensions [1] and generators [2] .
https://www.python.org/dev/peps/pep-0255/ is the PEP for generators, from v2.2.
Upvotes: 0
Reputation: 96349
There already is a good answer about the capability to send data into a generator with yield. Regarding readability considerations alone: simple, straightforward transformations can certainly be more readable as generator expressions:
(x + 1 for x in iterable if x%2 == 1)
Certain other operations, though, are easier to read and understand using a full generator definition, and some cases are a headache to fit into a generator expression at all. Try the following:
>>> x = ['arbitrarily', ['nested', ['data'], 'can', [['be'], 'hard'], 'to'], 'reach']
>>> def flatten_list_of_list(lol):
...     for l in lol:
...         if isinstance(l, list):
...             yield from flatten_list_of_list(l)
...         else:
...             yield l
...
>>> list(flatten_list_of_list(x))
['arbitrarily', 'nested', 'data', 'can', 'be', 'hard', 'to', 'reach']
Sure, you might be able to hack up a solution that fits on a single line using lambdas to achieve recursion, but it would be an unreadable mess. Now imagine I had some arbitrarily nested data structure that involved both list and dict, and logic to handle both cases... you get the point, I think.
Upvotes: 0
Reputation: 3442
Try doing this without yield:
def func():
    x = 1
    while 1:
        y = yield x
        x += y
f = func()
f.next()   # returns 1 (Python 2 syntax; use next(f) in Python 3)
f.send(3)  # returns 4
f.send(10) # returns 14
The generator has two important features:

1. The generator maintains some state (the value of x). Because of this state, this generator could eventually return any number of results without using huge amounts of memory.
2. Because of the state and the yield, we can provide the generator with information that it uses to compute its next output. That value is assigned to y when we call send.

I don't think this is possible without yield.
That said, I'm pretty sure that anything you can do with a generator function can also be done with a class.
Here's an example of a class that does exactly the same thing (python 2 syntax):
class MyGenerator(object):
    def __init__(self):
        self.x = 1

    def next(self):
        return self.x

    def send(self, y):
        self.x += y
        return self.next()
I didn't implement __iter__, but it's pretty obvious how that should work.
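For illustration, a sketch of the same class with the iterator protocol filled in (Python 3 names, so __next__ instead of next; this is my assumption, not the original answer's code):

class MyGenerator:
    def __init__(self):
        self.x = 1

    def __iter__(self):
        return self      # an iterator is its own iterable

    def __next__(self):
        return self.x    # note: never raises StopIteration, so
                         # iterating over it would be endless

    def send(self, y):
        self.x += y
        return next(self)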
Upvotes: 4
Reputation: 807
Think of yield as a "lazy return". Note that in your second example the function also returns a generator of values, not a fully evaluated list; only return [x*x for x in range(3)] would evaluate everything up front. Either lazy form may be perfectly acceptable depending on the use case. Yield is useful when processing large batches of streamed data, or when dealing with data that is not immediately available (think asynchronous operations).
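A common sketch of the streaming case (a hypothetical read_records and file name, assuming a plain text file):

def read_records(path):
    # lazily yield one line at a time - the whole file never sits in memory
    with open(path) as f:
        for line in f:
            yield line.rstrip('\n')

# usage: consume lazily, stopping early without reading the rest
for record in read_records('big.log'):
    if record.startswith('ERROR'):
        print(record)
        break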
Upvotes: 1