Reputation: 4698
Local dask allows using process scheduler. Workers in dask distributed are using ThreadPoolExecutor
to compute tasks. Is it possible to replace ThreadPoolExecutor
with ProcessPoolExecutor
in dask distributed? Thanks.
Upvotes: 1
Views: 515
Reputation: 28673
The distributed scheduler allows you to work with any number of processes, via any of the deployment options. Each of these can have one or more threads. Thus, you have the flexibility to choose your favourite mix of threads and processes as you see fit.
The simplest expression of this is with the LocalCluster
(same as Client()
by default):
cluster = LocalCluster(n_workers=W, threads_per_worker=T, processes=True)
makes W
workers with T
threads each (which can be 1).
As things stand, the implementation of workers uses a thread pool internally, and you cannot swap in a process pool in its place.
Upvotes: 1