user15964

Reputation: 2639

dask processes scheduler is not performing well

I define a CPU-bound function:

def countdown(n):
    while n > 0:
        n -= 1

Running countdown(50000000) takes 2.16 seconds on my laptop.
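A minimal way to reproduce this kind of wall-clock measurement (a sketch; the time.perf_counter harness is an assumption, not necessarily the exact one used for the numbers above):

import time

start = time.perf_counter()
countdown(50000000)   # CPU-bound busy loop defined above
print(f"serial: {time.perf_counter() - start:.2f} s")   # ~2.16 s reported above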

First, I test parallelization with multiprocess (the third-party fork of multiprocessing):

from multiprocess import Pool

with Pool(2) as p:
    l = p.map(countdown, [50000000, 50000000])

This takes 2.46 seconds, which is good parallelization: two tasks finish in barely more than the single-task time.

Then, I test parallelization with the dask processes scheduler:

import dask

l = [dask.delayed(countdown)(50000000), dask.delayed(countdown)(50000000)]
dask.compute(l, scheduler='processes', num_workers=2)

However, it takes 4.53 seconds. This is the same speed as

dask.compute(l, scheduler='threads', num_workers=2)

What is wrong with the dask processes scheduler? I expected it to be on a par with multiprocess.

Upvotes: 1

Views: 443

Answers (1)

SultanOrazbayev

Reputation: 16581

The following works, so I'm not sure if the above is a bug or a feature:

import dask
from dask.distributed import Client

with Client(n_workers=2) as client:
    l = [dask.delayed(countdown)(50000000), dask.delayed(countdown)(50000000)]
    dask.compute(*l)
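For a like-for-like comparison, the same kind of wall-clock timing can be wrapped around the compute call. A self-contained sketch (countdown is taken from the question; the time.perf_counter measurement and the __main__ guard are my additions):

import time
import dask
from dask.distributed import Client

def countdown(n):
    # CPU-bound busy loop from the question
    while n > 0:
        n -= 1

if __name__ == "__main__":  # guard needed because worker processes may be spawned
    with Client(n_workers=2) as client:
        l = [dask.delayed(countdown)(50000000), dask.delayed(countdown)(50000000)]
        start = time.perf_counter()
        dask.compute(*l)  # uses the active distributed client rather than the local schedulers
        print(f"distributed, 2 workers: {time.perf_counter() - start:.2f} s")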

Upvotes: 1
