Reputation: 2639
I define a CPU-bound function:

    def countdown(n):
        while n > 0:
            n -= 1

Running countdown(50000000) takes 2.16 seconds on my laptop.
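For reference, a minimal sketch of how such a baseline can be timed (the 2.16 s figure is machine-dependent; `time.perf_counter` is one common choice, not necessarily what the asker used):

```python
import time

def countdown(n):
    # Pure-Python busy loop: CPU-bound and holds the GIL the whole time.
    while n > 0:
        n -= 1

start = time.perf_counter()
countdown(50_000_000)
elapsed = time.perf_counter() - start
print(f"countdown took {elapsed:.2f} s")  # a couple of seconds, varies by machine
```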
First, I test multiprocess parallelization:

    from multiprocess import Pool
    with Pool(2) as p:
        l = p.map(countdown, [50000000, 50000000])

This takes 2.46 seconds, which is good parallelization: two calls finish in roughly the time of one.
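A self-contained version of that test, as a sketch: it uses the standard-library `multiprocessing` rather than the `multiprocess` fork from the question (an assumption, but the APIs match here), and adds the `__main__` guard that is required on platforms that spawn rather than fork workers:

```python
import time
from multiprocessing import Pool

def countdown(n):
    while n > 0:
        n -= 1

def run_pool(n, workers=2):
    # Map two independent countdowns onto two worker processes;
    # each worker runs its own interpreter, so the GIL is not shared.
    with Pool(workers) as p:
        return p.map(countdown, [n, n])

if __name__ == "__main__":
    start = time.perf_counter()
    run_pool(50_000_000)
    print(f"pool took {time.perf_counter() - start:.2f} s")
```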
Then, I test dask processes scheduler parallelization:

    import dask
    l = [dask.delayed(countdown)(50000000), dask.delayed(countdown)(50000000)]
    dask.compute(l, scheduler='processes', num_workers=2)

However, it takes 4.53 seconds. This is the same speed as

    dask.compute(l, scheduler='threads', num_workers=2)

What is wrong with the dask processes scheduler? I expected it to be on a par with multiprocess.
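That the processes scheduler matches the threads scheduler is the telling part: for a pure-Python loop like countdown, threads cannot run in parallel because of the GIL, so two threads take roughly twice as long as one. A stdlib-only sketch of that effect (timings are machine-dependent):

```python
import threading
import time

def countdown(n):
    while n > 0:
        n -= 1

N = 10_000_000

start = time.perf_counter()
countdown(N)
serial = time.perf_counter() - start

threads = [threading.Thread(target=countdown, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

# Under the GIL the two threads interleave rather than run in parallel,
# so `threaded` comes out close to 2 * `serial` for this CPU-bound loop.
print(f"serial: {serial:.2f} s, two threads: {threaded:.2f} s")
```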
Upvotes: 1
Views: 443
Reputation: 16581
The following works, so I'm not sure whether the above is a bug or a feature:

    import dask
    from dask.distributed import Client

    with Client(n_workers=2) as client:
        l = [dask.delayed(countdown)(50000000), dask.delayed(countdown)(50000000)]
        dask.compute(*l)
Upvotes: 1