Reputation: 39889
The subject contains the whole idea. I came across a code sample that shows something like:
async for item in getItems():
    await item.process()
And others where the code is:
for item in await getItems():
    await item.process()
Is there a notable difference between these two approaches?
Upvotes: 7
Views: 1281
Reputation: 18588
While both of them could theoretically work with the same object (without causing an error), they most likely do not. In general those two notations are not equivalent at all, but invoke entirely different protocols and are applied to very distinct use cases.
To understand the difference, you first need to understand the concept of an iterable.
Abstractly speaking, an object is iterable if it implements the __iter__ method or (less commonly for iteration) a sequence-like __getitem__ method.
Practically speaking, an object is iterable if you can use it in a for-loop, i.e. for _ in iterable. A for-loop implicitly invokes the __iter__ method of the iterable and expects it to return an iterator, which implements the __next__ method. That method is called at the start of each iteration of the for-loop, and its return value is what is assigned to the loop variable.
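As a rough sketch of what a for-loop does under the hood, the loop can be spelled out with the built-in iter() and next() functions. The helper name manual_for is made up for illustration.

```python
def manual_for(iterable):
    """Collect items the way a for-loop would, using explicit protocol calls."""
    collected = []
    iterator = iter(iterable)      # normally calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)  # calls iterator.__next__()
        except StopIteration:      # raised once the iterator is exhausted
            break
        collected.append(item)
    return collected


if __name__ == "__main__":
    print(manual_for([1, 2, 3]))
```

The loop body of a real for-loop corresponds to the collected.append(item) line here.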
The async world introduced a variation of that, namely the asynchronous iterable.
An object is asynchronously iterable if it implements the __aiter__ method.
Again, practically speaking, an object is asynchronously iterable if it can be used in an async for-loop, i.e. async for _ in async_iterable. An async for-loop calls the __aiter__ method of the asynchronous iterable and expects it to return an asynchronous iterator, which implements the __anext__ coroutine method. That method is awaited at the start of each iteration of the async for-loop.
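The asynchronous protocol can be spelled out by hand in much the same way. This is only a sketch: the Countdown class and the manual_async_for helper are hypothetical names, and the real async for statement does roughly, not exactly, this.

```python
import asyncio


class Countdown:
    """Hypothetical asynchronous iterator that produces n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.n <= 0:
            raise StopAsyncIteration
        await asyncio.sleep(0)  # give the event loop a chance to run other tasks
        self.n -= 1
        return self.n + 1


async def manual_async_for(async_iterable):
    """Collect items the way an async for-loop would, using explicit calls."""
    collected = []
    async_iterator = async_iterable.__aiter__()
    while True:
        try:
            item = await async_iterator.__anext__()
        except StopAsyncIteration:  # raised once the iterator is exhausted
            break
        collected.append(item)
    return collected


if __name__ == "__main__":
    print(asyncio.run(manual_async_for(Countdown(3))))
```

Note that __aiter__ is a plain method, while __anext__ is the part that gets awaited.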
Typically, an asynchronous iterable is not awaitable, i.e. it is not a coroutine and does not implement an __await__ method, and vice versa. The two are not mutually exclusive, though: you could design an object that is both awaitable by itself and also (asynchronously) iterable, though that seems like a very strange design.
Just to be very clear about the terminology used, the iterator is a subtype of the iterable. That means an iterator also implements the iterable protocol by providing an __iter__ method, but it additionally provides the __next__ method. Analogously, the asynchronous iterator is a subtype of the asynchronous iterable because it implements the __aiter__ method, but also provides the __anext__ coroutine method.
You do not need the object to be an iterator for it to be used in a for-loop; you need its __iter__ method to return an iterator. The fact that you can use an (asynchronous) iterator in an (async) for-loop is because it is also an (asynchronous) iterable. Many objects are both: generators and asynchronous generators, for example, are their own iterators. Others, such as lists and other built-in containers, are iterables without being iterators.
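The subtype relationship can be checked with the abstract base classes in collections.abc. A small sketch, using a list and a generator as examples (the describe helper is made up for illustration):

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]                 # a list
numbers_iterator = iter(numbers)    # the iterator a for-loop would obtain from it
squares = (n * n for n in numbers)  # a generator expression


def describe(obj):
    """Return (is_iterable, is_iterator) for the given object."""
    return isinstance(obj, Iterable), isinstance(obj, Iterator)


if __name__ == "__main__":
    print(describe(numbers))           # the list itself
    print(describe(numbers_iterator))  # the iterator obtained from the list
    print(describe(squares))           # the generator
    # Calling iter() on an iterator returns the very same object.
    print(iter(numbers_iterator) is numbers_iterator)
```

Every iterator reports as iterable, but the list reports as iterable only.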
async for _ in get_items()
That code implies that whatever is returned by the get_items function is an asynchronous iterable.
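Note that get_items is not a coroutine function here: calling it returns the object directly, without any await, and that object implements the asynchronous iterable protocol. (It may be a regular function or an asynchronous generator function.) That means we could write the following instead: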
async_iterable = get_items()
async for item in async_iterable:
    ...
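The question does not show how get_items is defined. One common way to end up with this shape is an asynchronous generator function, so the following is a minimal sketch under that assumption, with made-up item values:

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncIterator


async def get_items() -> AsyncIterator[str]:
    """Hypothetical async generator: calling it returns an async iterator at once."""
    for name in ("a", "b", "c"):
        await asyncio.sleep(0)  # stand-in for real I/O, e.g. fetching one item
        yield name


async def main() -> list[str]:
    received = []
    async for item in get_items():  # note: no await in front of get_items()
        received.append(item)
    return received


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Each item is produced on demand, so the waiting is spread across the iterations.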
for _ in await get_items()
Whereas this snippet implies that get_items is in fact a coroutine function (i.e. a callable returning an awaitable) and that the return value of that coroutine is a normal iterable.
Note that we know for certain that the object returned by the get_items coroutine is a normal iterable because otherwise the regular for-loop would not work with it. The equivalent code would be:
iterable = await get_items()
for item in iterable:
    ...
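Again, the question does not show get_items itself. A minimal sketch of this second shape, assuming a coroutine function that gathers everything first and returns a plain list (the item values are made up):

```python
import asyncio


async def get_items():
    """Hypothetical coroutine function: all items are gathered before returning."""
    await asyncio.sleep(0)  # stand-in for one asynchronous fetch of everything
    return ["a", "b", "c"]


async def main():
    received = []
    for item in await get_items():  # await once, then iterate synchronously
        received.append(item)
    return received


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Here all of the waiting happens up front, before the first iteration starts.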
Another implication of those code snippets is that in the first one the function (returning the asynchronous iterator) is non-asynchronous, i.e. calling it will not yield control to the event loop, whereas each iteration of the async for-loop is asynchronous (and thus will allow context switches).
Conversely, in the second one the function returning the normal iterator is an asynchronous call, but all of the iterations (the calls to __next__) are non-asynchronous.
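The difference in where context switches can happen becomes visible when two consumers run concurrently. The sketch below uses hypothetical helpers (ticker, fetch_all, run_pair) and asyncio.sleep(0) as a stand-in for real I/O:

```python
import asyncio


async def ticker(label, count):
    """Hypothetical async generator: yields to the event loop before each item."""
    for i in range(count):
        await asyncio.sleep(0)
        yield f"{label}{i}"


async def fetch_all(label, count):
    """Hypothetical coroutine: yields to the event loop once, then returns a list."""
    await asyncio.sleep(0)
    return [f"{label}{i}" for i in range(count)]


async def consume_with_async_for(label, log):
    async for item in ticker(label, 2):     # may switch tasks on every iteration
        log.append(item)


async def consume_with_for(label, log):
    for item in await fetch_all(label, 2):  # may switch tasks only at the await
        log.append(item)


async def run_pair(consumer):
    """Run two consumers concurrently and record the order of their items."""
    log = []
    await asyncio.gather(consumer("a", log), consumer("b", log))
    return log


if __name__ == "__main__":
    print(asyncio.run(run_pair(consume_with_async_for)))
    print(asyncio.run(run_pair(consume_with_for)))
```

With the standard asyncio event loop, the async for version interleaves the two consumers (a0, b0, a1, b1), while the plain for version lets each consumer finish its loop without interruption (a0, a1, b0, b1).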
The practical takeaway should be that those two snippets you showed are never equivalent. The main reason is that get_items either is or is not a coroutine function. If it is not, you cannot do await get_items(). But whether you can use async for or for depends on whatever is returned by get_items.
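To see what happens when the wrong form is used, here is a small sketch with two hypothetical functions, one of each kind. Both mismatches fail with a TypeError at runtime:

```python
import asyncio


async def items_as_coroutine():
    """Hypothetical coroutine function returning a plain list."""
    return [1, 2]


async def items_as_async_generator():
    """Hypothetical async generator function."""
    yield 1
    yield 2


async def mismatches():
    """Try both wrong combinations and report which ones raised TypeError."""
    errors = []
    coro = items_as_coroutine()
    try:
        # A coroutine object has no __aiter__, so async for rejects it.
        async for _ in coro:
            pass
    except TypeError:
        errors.append("async for over coroutine")
    finally:
        coro.close()  # avoids a "coroutine was never awaited" warning
    try:
        # An async generator object is not awaitable, so await rejects it.
        await items_as_async_generator()
    except TypeError:
        errors.append("await on async generator")
    return errors


if __name__ == "__main__":
    print(asyncio.run(mismatches()))
```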
For the sake of completeness, it should be noted that combinations of the aforementioned protocols are entirely feasible, although not all that common. Consider the following example:
from __future__ import annotations


class Foo:
    x = 0

    def __iter__(self) -> Foo:
        return self

    def __next__(self) -> int:
        if self.x >= 2:
            raise StopIteration
        self.x += 1
        return self.x

    def __aiter__(self) -> Foo:
        return self

    async def __anext__(self) -> int:
        if self.x >= 3:
            raise StopAsyncIteration
        self.x += 1
        return self.x * 10


async def main() -> None:
    for i in Foo():
        print(i)
    async for i in Foo():
        print(i)


if __name__ == "__main__":
    from asyncio import run

    run(main())
In this example, Foo implements four distinct protocols:
- iterable (def __iter__)
- iterator (def __next__)
- asynchronous iterable (def __aiter__)
- asynchronous iterator (async def __anext__)
Running the main coroutine gives the following output:
1
2
10
20
30
This shows that objects can absolutely be all those things at the same time. Since Foo is both a synchronous and an asynchronous iterable, we could write two functions (one coroutine, one regular) that each return an instance of Foo and then replicate your example a bit:
from collections.abc import AsyncIterable, Iterable


def get_items_sync() -> AsyncIterable[int]:
    return Foo()


async def get_items_async() -> Iterable[int]:
    return Foo()


async def main() -> None:
    async for i in get_items_sync():
        print(i)
    for i in await get_items_async():
        print(i)
    async for i in await get_items_async():
        print(i)


if __name__ == "__main__":
    from asyncio import run

    run(main())
Output:
10
20
30
1
2
10
20
30
This illustrates very clearly that the only thing determining which of our Foo methods is called (__next__ or __anext__) is whether we use a for-loop or an async for-loop.
The for-loop will always call the __next__ method at least once and continue calling it for each iteration, until it intercepts a StopIteration exception.
The async for-loop will always await the __anext__ coroutine at least once and continue calling and awaiting it for each subsequent iteration, until it intercepts a StopAsyncIteration exception.
Upvotes: 6
Reputation: 92854
Those are completely different.
This for item in await getItems() won't work (it will raise an error) if getItems() is an asynchronous iterator or asynchronous generator. It can be used only if getItems is a coroutine function which, in your case, is expected to return a sequence object (a simple iterable).
async for is the conventional (and Pythonic) way to iterate asynchronously over an async iterator/generator.
Upvotes: 0