power
power

Reputation: 1760

Dask on single OSX machine - is it parallel by default?

I have installed Dask on OSX Mojave. Does it execute computations in parallel by default? Or do I need to change some settings?

I am using the DataFrame API. Does that make a difference to the answer?

I installed it with pip. Does that make a difference to the answer?

Upvotes: 0

Views: 217

Answers (1)

mdurant
mdurant

Reputation: 28673

Yes, Dask is parallel by default.

Unless you specify otherwise, or create a distributed Client, execution will happen with the "threaded" scheduler, in a number of threads equal to your number of cores. Note, however, that because of the python GIL (only one python instruction executed at a time), you may not get as much parallelism as available, depending on how good your specific tasks are at releasing the GIL. That is why you have a choice of schedulers.

Being on OSX, installing with pip: these make no difference. Using dataframes makes a difference in that it dictates the sorts of tasks you're likely running. Pandas is good at releasing the GIL for many operations.

Upvotes: 1

Related Questions