Guest-01
Guest-01

Reputation: 833

is asyncio.to_thread() method different to ThreadPoolExecutor?

I see that asyncio.to_thread() method is been added @python 3.9+, its description says it runs blocking codes on a separate thread to run at once. see example below:

def blocking_io():
    print(f"start blocking_io at {time.strftime('%X')}")
    # Note that time.sleep() can be replaced with any blocking
    # IO-bound operation, such as file operations.
    time.sleep(1)
    print(f"blocking_io complete at {time.strftime('%X')}")

async def main():
    print(f"started main at {time.strftime('%X')}")

    await asyncio.gather(
        asyncio.to_thread(blocking_io),
        asyncio.sleep(1))

    print(f"finished main at {time.strftime('%X')}")


asyncio.run(main())

# Expected output:
#
# started main at 19:50:53
# start blocking_io at 19:50:53
# blocking_io complete at 19:50:54
# finished main at 19:50:54

By explanation, it seems like using thread mechanism and not context switching nor coroutine. Does this mean it is not actually an async after all? is it same as a traditional multi-threading as in concurrent.futures.ThreadPoolExecutor? what is the benefit of using thread this way then?

Upvotes: 37

Views: 30427

Answers (2)

sean mooney
sean mooney

Reputation: 69

you would use asyncio.to_tread when ever you need to call a blocking api from a third party lib that either does not have an asyncio adapter/interface or where you do not want to create one because you just need to use a limited number of functions form that lib.

a concrete example is i am currently writing a applicaiton that will eventually run as a daemon at which point it will use asyncio for its core event loop. The eventloop will involved monitoring a unix socket for notifications which will trigger the deamon to take an action.

for rapid prototyping its currently a cli but one of the depencies/external system the deamon will interact with is call libvirt, an abstraction layer for virtual machine management written in c with a python wrapper called libvirt python.

the python binding are blocking and comunitcate with the libvirt deamon over a separate unix socket with a blocking request responce protocol.

you can conceptually think of making a call to the libvirt bindings as each function internally making a http request to a server and waiting for the server to complete the action. The exact mechanics of how it does that are not important for this disucssion just that its a blocking io operation that depends on and external process that may take some time. i.e. this is not a cpu bound call and therefore it can be offloaded to a thread and awaited.

if i was to directly call “domains = libvirt.conn.listAllDomains()” in a async function that would block my asyncio event loop until i got a responce form libvirt.

so if any events were recived on the unix socket my main loop is monitoring they would not be processed while we are waiting for the libvirt deamon to look up all domains and return the list of them to us.

if i use “domains = await asyncio.to_thread(libvirt.conn.listAllDomains)” however the await call will suspend my current coroutine until we get the responce, yeilding execution back to the asyncio event loop. that means if the daemon recives a notification while we are waiting on libvirt it can be schduled to run concurrently instead of being blocked.

in my application i will also need to read and write to linux speical files in /sys. linux has natiave aio file support which can be used with asyncio vai aiofile however linux does not supprot the aio interface for managing special files, so i would have to use blocking io.

one way to do that in a async applicaiton would be to wrap function that writes to the special files asyncio.to_thread.

i could and might use a decorator to use run_in_executor directly since i own the write_sysfs function but if i did not then to_thread is more polite then monkeypatching someone else’s lib and less work then creating my own wrapper api.

hopefully those are useful examples of where you might want to use to_thread. its really just a convince function and you can use run_in_executor to do the same thing with so addtional overhead.

if you need to support older python release you might also prefer run_in_executor since it predates the intorduction of to_thread but if you can assume 3.9+ then its a nice addtion to leverage when you need too.

Upvotes: 6

alex_noname
alex_noname

Reputation: 32233

Source code of to_thread is quite simple. It boils down to awaiting run_in_executor with a default executor (executor argument is None) which is ThreadPoolExecutor.

In fact, yes, this is traditional multithreading, сode intended to run on a separate thread is not asynchronous, but to_thread allows you to await for its result asynchronously.

Also note that the function runs in the context of the current task, so its context variable values will be available inside the func.

async def to_thread(func, /, *args, **kwargs):
    """Asynchronously run function *func* in a separate thread.
    Any *args and **kwargs supplied for this function are directly passed
    to *func*. Also, the current :class:`contextvars.Context` is propogated,
    allowing context variables from the main thread to be accessed in the
    separate thread.
    Return a coroutine that can be awaited to get the eventual result of *func*.
    """
    loop = events.get_running_loop()
    ctx = contextvars.copy_context()
    func_call = functools.partial(ctx.run, func, *args, **kwargs)
    return await loop.run_in_executor(None, func_call)

Upvotes: 53

Related Questions