Labo

Reputation: 2772

Dask: many small workers vs a big worker

I am trying to understand this simple example from the dask-jobqueue documentation:

from dask_jobqueue import PBSCluster

cluster = PBSCluster(cores=36,
                     memory='100GB',
                     project='P48500028',
                     queue='premium',
                     walltime='02:00:00')

cluster.start_workers(100)  # Start 100 jobs that match the description above

from dask.distributed import Client
client = Client(cluster)    # Connect to that cluster

I think it means that there will be 100 jobs each using 36 cores.

Let's say I can use 48 cores on a cluster.

Should I use 1 worker with 48 cores or 48 workers of 1 core each?

Upvotes: 1

Views: 609

Answers (1)

MRocklin

Reputation: 57261

If your computations mostly release the GIL, then you'll probably want several threads per process. This is true if you're doing mostly NumPy, Pandas, Scikit-Learn, or Numba/Cython programming on numeric data. I might do something like six processes with eight threads each.
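A minimal sketch of that layout with dask-jobqueue: the `processes` keyword splits each job into that many worker processes, and the threads are divided evenly among them. The memory, queue, and walltime values are just reused from the question as placeholders.

from dask_jobqueue import PBSCluster

# 6 worker processes per 48-core job, so 48 / 6 = 8 threads each;
# suits GIL-releasing workloads (NumPy, Pandas, etc.)
cluster = PBSCluster(cores=48,
                     processes=6,
                     memory='100GB',       # placeholder values from the question
                     queue='premium',
                     walltime='02:00:00')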

If your computations are mostly pure Python code, for example if you process text data or iterate heavily with Python for loops over dicts/lists/etc., then you'll want fewer threads per process, maybe two.
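For the pure-Python case, the same sketch with more, smaller processes (again with hypothetical resource values):

# 24 worker processes per 48-core job, so 48 / 24 = 2 threads each;
# suits pure-Python workloads that hold the GIL
cluster = PBSCluster(cores=48,
                     processes=24,
                     memory='100GB',
                     queue='premium',
                     walltime='02:00:00')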

Upvotes: 3
