xssl

Reputation: 65

How to gracefully shut down coroutines with Ctrl+C?

I'm writing a spider to crawl web pages. asyncio seemed like the best choice, so I use coroutines to process the work asynchronously. Now I'm stuck on how to quit the program on a keyboard interrupt. The program shuts down cleanly once all the work has been done. The source code runs on Python 3.5 and is attached below.

import asyncio
import aiohttp
from contextlib import suppress

class Spider(object):
    def __init__(self):
        self.max_tasks = 2
        self.task_queue = asyncio.Queue(self.max_tasks)
        self.loop = asyncio.get_event_loop()
        self.counter = 1

    def close(self):
        for w in self.workers:
            w.cancel()

    async def fetch(self, url):
        try:
            async with aiohttp.ClientSession(loop = self.loop) as self.session:
                with aiohttp.Timeout(30, loop = self.session.loop):
                    async with self.session.get(url) as resp:
                        print('get response from url: %s' % url)
        except:
            pass
        finally:
            pass

    async def work(self):
        while True:
            url = await self.task_queue.get()
            await self.fetch(url)
            self.task_queue.task_done()

    def assign_work(self):
        print('[*]assigning work...')
        url = 'https://www.python.org/'
        if self.counter > 10:
            return 'done'
        for _ in range(self.max_tasks):
            self.counter += 1
            self.task_queue.put_nowait(url)

    async def crawl(self):
        self.workers = [self.loop.create_task(self.work()) for _ in range(self.max_tasks)]
        while True:
            if self.assign_work() == 'done':
                break
            await self.task_queue.join()
        self.close()

def main():
    loop = asyncio.get_event_loop()
    spider = Spider()
    try:
        loop.run_until_complete(spider.crawl())
    except KeyboardInterrupt:
        print ('Interrupt from keyboard')
        spider.close()
        pending  = asyncio.Task.all_tasks()
        for w in pending:
            w.cancel()
            with suppress(asyncio.CancelledError):
                loop.run_until_complete(w)
    finally:
        loop.stop()
        loop.run_forever()
        loop.close()

if __name__ == '__main__':
    main()

But if I press 'Ctrl+C' while it's running, the behavior is inconsistent. Sometimes the program shuts down gracefully with no error message. In other cases it keeps running after 'Ctrl+C' and doesn't stop until all the work has been done. If I press 'Ctrl+C' again at that point, I get 'Task was destroyed but it is pending!'.

I have read some topics about asyncio and added some code in main() to close the coroutines gracefully, but it doesn't work. Has anyone else had a similar problem?

Upvotes: 5

Views: 2705

Answers (2)

Mikhail Gerasimov

Reputation: 39546

I bet the problem happens here:

except:
    pass

You should never do this: a bare except catches everything, including the exception asyncio uses to cancel a task. Your situation is one more example of what can go wrong.

When you cancel a task and await its cancellation, asyncio.CancelledError is raised inside the task, and it must not be suppressed anywhere inside it. The line where you await the cancelled task should itself raise this exception; otherwise the task will continue executing.

That's why you do

task.cancel()
with suppress(asyncio.CancelledError):
    loop.run_until_complete(task)  # this line should raise CancelledError, 
                                   # otherwise task will continue

to actually cancel the task.
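
To see the difference in isolation, here is a minimal sketch. The names fetch_swallowing, fetch_propagating, worker and cancel_and_report are hypothetical, asyncio.sleep stands in for the HTTP request, and asyncio.run assumes Python 3.7 or later (the question targets 3.5).

```python
import asyncio


async def fetch_swallowing():
    try:
        await asyncio.sleep(10)  # stand-in for the HTTP request
    except:  # bare except: also swallows CancelledError
        pass


async def fetch_propagating():
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        raise  # let the cancellation reach the task
    except Exception:
        pass  # other errors are still ignored, as in the question


async def worker(fetch, log):
    await fetch()
    log.append('kept running after cancel')


async def cancel_and_report(fetch):
    log = []
    task = asyncio.ensure_future(worker(fetch, log))
    await asyncio.sleep(0.01)  # give the worker time to start waiting
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return task.cancelled(), log


print(asyncio.run(cancel_and_report(fetch_swallowing)))
print(asyncio.run(cancel_and_report(fetch_propagating)))
```

With the bare except, the task is never marked as cancelled and its log shows that it kept going after cancel(). With the explicit re-raise, task.cancelled() is True and the log stays empty.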

Upd:

But I still don't quite understand why the original code could sometimes quit cleanly with 'Ctrl+C'.

It depends on the state of your tasks:

  1. If all tasks are done at the moment you press 'Ctrl+C', none of them will raise CancelledError when awaited and your code will finish normally.
  2. If some tasks are pending but close to finishing at that moment, your code will stall briefly on task cancellation and finish shortly afterwards, once those tasks complete.
  3. If some tasks are pending and far from finished at that moment, your code will hang trying to cancel them (which can't be done, because the bare except swallows the CancelledError). Another 'Ctrl+C' will interrupt the cancelling process, but the tasks will be neither cancelled nor finished, and you'll get the warning 'Task was destroyed but it is pending!'.
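
As a rough sketch of a cancel-then-wait step that treats all three cases the same way, the tasks can be cancelled together and awaited with one gather call. This assumes no task swallows CancelledError. The names cancel_pending and sleepy_worker are hypothetical, and asyncio.all_tasks is the Python 3.7+ counterpart of the asyncio.Task.all_tasks call used in the question.

```python
import asyncio


def cancel_pending(loop):
    """Cancel all unfinished tasks on `loop` and wait until they are done."""
    pending = asyncio.all_tasks(loop)  # only tasks that are not done yet
    for task in pending:
        task.cancel()
    if pending:
        # return_exceptions=True collects each CancelledError instead of raising
        loop.run_until_complete(
            asyncio.gather(*pending, return_exceptions=True))
    return pending


async def sleepy_worker():
    await asyncio.sleep(3600)  # does not swallow CancelledError


loop = asyncio.new_event_loop()
workers = [loop.create_task(sleepy_worker()) for _ in range(2)]
loop.run_until_complete(asyncio.sleep(0.01))  # let the workers start
cancelled = cancel_pending(loop)
loop.close()
print(all(task.cancelled() for task in workers))
```

Because every task has finished before loop.close() is called, there is nothing left for the 'Task was destroyed but it is pending!' warning to report.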

Upvotes: 3

Alfe

Reputation: 59436

I assume you are using some flavor of Unix; if that is not the case, my comments might not apply to your situation.

Pressing Ctrl-C in a terminal sends the signal SIGINT to the foreground processes associated with this tty. A Python process catches this Unix signal and translates it into a KeyboardInterrupt exception. In a threaded application (asyncio itself runs coroutines on a single thread, though it can hand blocking calls such as DNS lookups to a thread pool) typically only one thread (the main thread) receives this signal and reacts in this fashion. If it is not specifically prepared for this situation, it will terminate due to the exception.
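
The translation from SIGINT to KeyboardInterrupt can be observed without a terminal. This is only a small sketch: interrupt_self is a hypothetical helper, signal.raise_signal needs Python 3.8 or later, and the code assumes it runs in the main thread with Python's default SIGINT handler still installed.

```python
import signal


def interrupt_self():
    """Deliver SIGINT to this process and report how Python surfaces it."""
    try:
        signal.raise_signal(signal.SIGINT)  # the same signal Ctrl-C produces
    except KeyboardInterrupt:
        return 'KeyboardInterrupt'
    return 'no exception'


print(interrupt_self())
```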

Then the threading machinery will wait for the remaining threads to terminate before the Unix process as a whole exits with an exit code. This can take quite a long time. See this question about killing fellow threads and why this isn't possible in general.

What you want to do, I assume, is kill your process immediately, killing all threads in one step.

The easiest way to achieve this is to press Ctrl-\. This sends a SIGQUIT instead of a SIGINT, which by default terminates the whole process, including all of its threads.

If this is not enough (because for whatever reason you need to react properly to Ctrl-C), you can send yourself a signal:

import os, signal

os.kill(os.getpid(), signal.SIGQUIT)

This should terminate all running threads unless the process specifically catches SIGQUIT, in which case you can still use SIGKILL to perform a hard kill. That gives the process no chance to react, though, and might lead to problems.
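
To check the effect without killing your own interpreter, the same call can be tried in a child process. This is a Unix-only sketch: CHILD_CODE is a hypothetical stand-in program, and it assumes the child leaves SIGQUIT at its default action.

```python
import signal
import subprocess
import sys

# Child program: sends itself SIGQUIT, then would otherwise sleep for a minute.
CHILD_CODE = (
    'import os, signal, time\n'
    'os.kill(os.getpid(), signal.SIGQUIT)\n'
    'time.sleep(60)\n'
)

result = subprocess.run([sys.executable, '-c', CHILD_CODE])
# On Unix, a negative return code -N means the child was killed by signal N.
print(result.returncode == -signal.SIGQUIT)
```

The child never reaches the end of its sleep, so no cleanup code in it gets a chance to run, which is the trade-off mentioned above.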

Upvotes: 0
