Astrum

Reputation: 41

How to make APScheduler ProcessPoolExecutor close process upon completion and only spawn processes it needs?

ProcessPoolExecutor spawns a bunch of processes to handle APScheduler jobs. I expect each process spawned by ProcessPoolExecutor to be shut down once its job completes successfully, and a new process to be spawned for the next execution of that job. I also expect no processes to be spawned unless they are needed. That isn't what happens: if I set max_workers to 10, 10 processes are spawned even when the only job has max_instances of 3. After a process finishes a run of the job, it isn't reclaimed; it is merely repurposed for the next run of that job.

I'll give an example:

Create a BlockingScheduler that uses ProcessPoolExecutor as its executor:

from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.executors.pool import ProcessPoolExecutor

def printing_job():
    print("print this...")

def main():
    executors = {
        'default': ProcessPoolExecutor(max_workers=10)
    }
    job_defaults = {
        'coalesce': False,
        'max_instances': 3,
        'misfire_grace_time': None
    }
    # job_defaults must be passed in, or max_instances=3 is never applied
    scheduler = BlockingScheduler(executors=executors,
                                  job_defaults=job_defaults)
    scheduler.add_job(printing_job, 'interval', seconds=1)
    scheduler.start()

if __name__ == '__main__':
    main()

11 processes are spawned: the main process plus 10 pool worker processes:

user   61428  59435 18 16:25 pts/2    00:00:00 ../bin/python3 ./test.py
user   61456  61428  0 16:25 pts/2    00:00:00 ../bin/python3 ./test.py
user   61457  61428  0 16:25 pts/2    00:00:00 ../bin/python3 ./test.py
user   61458  61428  0 16:25 pts/2    00:00:00 ../bin/python3 ./test.py
user   61459  61428  0 16:25 pts/2    00:00:00 ../bin/python3 ./test.py
user   61460  61428  0 16:25 pts/2    00:00:00 ../bin/python3 ./test.py
user   61461  61428  0 16:25 pts/2    00:00:00 ../bin/python3 ./test.py
user   61462  61428  0 16:25 pts/2    00:00:00 ../bin/python3 ./test.py
user   61463  61428  0 16:25 pts/2    00:00:00 ../bin/python3 ./test.py
user   61464  61428  0 16:25 pts/2    00:00:00 ../bin/python3 ./test.py
user   61465  61428  0 16:25 pts/2    00:00:00 ../bin/python3 ./test.py

Only 3 of them are ever used by the job. I should see at most 4 processes, and the worker processes should be reaped and re-created.

Is this paradigm not possible with APScheduler?

The ProcessPoolExecutor documentation states that max_workers is the number of workers to be spawned at most. The key phrase is "at most", which to me indicates it shouldn't spawn more than it needs.

Upvotes: 4

Views: 1159

Answers (1)

Michael Tamillow
Michael Tamillow

Reputation: 425

I solved this, more or less, by writing a custom function that generalizes the kind of parallel and/or concurrent execution I want. Inside that function you can implement the same behavior you are hoping for here. I could not resolve this any other way.

Upvotes: 0
