Reputation: 41
ProcessPoolExecutor spawns a pool of processes that handle APScheduler jobs. I expect each process spawned by ProcessPoolExecutor to be shut down once a job completes successfully, and a new process to be spawned for the next run of that job. I also expect processes not to be spawned when there is no need for them. This, however, doesn't happen: if I set max_workers to 10, 10 processes are spawned, even if the only job has max_instances of 3. After a process finishes a job, it isn't reclaimed; it is merely kept around and reused for the next run of said job.
I'll give an example:
Create a BlockingScheduler that uses ProcessPoolExecutor as its executor:
from apscheduler.executors.pool import ProcessPoolExecutor
from apscheduler.schedulers.blocking import BlockingScheduler

def printing_job():
    print("print this...")

def main():
    executors = {
        'default': ProcessPoolExecutor(max_workers=10)
    }
    job_defaults = {
        'coalesce': False,
        'max_instances': 3,
        'misfire_grace_time': None
    }
    scheduler = BlockingScheduler(executors=executors,
                                  job_defaults=job_defaults)
    scheduler.add_job(printing_job, 'interval', seconds=1)
    scheduler.start()

if __name__ == '__main__':
    main()
11 processes are spawned: the main process plus 10 pool workers:
user 61428 59435 18 16:25 pts/2 00:00:00 ../bin/python3 ./test.py
user 61456 61428 0 16:25 pts/2 00:00:00 ../bin/python3 ./test.py
user 61457 61428 0 16:25 pts/2 00:00:00 ../bin/python3 ./test.py
user 61458 61428 0 16:25 pts/2 00:00:00 ../bin/python3 ./test.py
user 61459 61428 0 16:25 pts/2 00:00:00 ../bin/python3 ./test.py
user 61460 61428 0 16:25 pts/2 00:00:00 ../bin/python3 ./test.py
user 61461 61428 0 16:25 pts/2 00:00:00 ../bin/python3 ./test.py
user 61462 61428 0 16:25 pts/2 00:00:00 ../bin/python3 ./test.py
user 61463 61428 0 16:25 pts/2 00:00:00 ../bin/python3 ./test.py
user 61464 61428 0 16:25 pts/2 00:00:00 ../bin/python3 ./test.py
user 61465 61428 0 16:25 pts/2 00:00:00 ../bin/python3 ./test.py
Only 3 of them are ever used by the job, so I should see at most 4 processes, and the worker processes should be reaped and re-created.
Is this paradigm not possible with APScheduler?
The ProcessPoolExecutor documentation states that max_workers is the number of workers to be spawned at most. The key phrase is "at most", which to me indicates it shouldn't spawn more than it needs.
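For reference, APScheduler's ProcessPoolExecutor appears to be a thin wrapper around the standard library's concurrent.futures.ProcessPoolExecutor, so the keep-alive behaviour comes from the underlying pool. Here is a minimal standalone sketch of that behaviour (tiny_task is just an illustrative name; depending on the Python version the pool may spawn all workers up front or on demand, but either way finished workers are kept for reuse rather than reaped):

import multiprocessing
import time
from concurrent.futures import ProcessPoolExecutor

def tiny_task():
    return "done"

if __name__ == '__main__':
    with ProcessPoolExecutor(max_workers=10) as pool:
        print(pool.submit(tiny_task).result())  # the task has fully completed
        time.sleep(1)
        # The spawned workers are still alive after the task finishes;
        # the pool keeps them for reuse instead of reaping them.
        print("live workers:", len(multiprocessing.active_children()))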
Upvotes: 4
Views: 1159
Reputation: 425
I solved this, more or less, by writing a custom function that generalizes the kind of parallel and/or concurrent worker I want to execute. Inside that function you can do the same thing you are hoping to do here. I could not resolve this any other way.
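The answer doesn't include code, so the sketch below is only a guess at the shape of such a helper, under the assumption that "generalize" means a wrapper that runs any callable in its own short-lived worker per invocation and joins it, so the worker is reaped as soon as the run completes. run_isolated and its use_process parameter are illustrative names, not part of APScheduler or this answer.

import functools
from multiprocessing import Process
from threading import Thread

def run_isolated(func, use_process=True):
    """Wrap func so each call starts, runs, and reaps a fresh worker."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Pick a fresh process (or thread) for this single invocation.
        worker_cls = Process if use_process else Thread
        worker = worker_cls(target=func, args=args, kwargs=kwargs)
        worker.start()
        worker.join()  # block until done; the child is reaped here
    return wrapper

Scheduled with APScheduler's default thread pool executor, e.g. scheduler.add_job(run_isolated(printing_job), 'interval', seconds=1), each run then forks a fresh process that exits and is reaped when the job finishes.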
Upvotes: 0