ericmjl

Behaviour of dask client.submit

With the following example:

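from random import random
from dask.distributed import Client

client = Client()  # assumption: a local cluster; the original post did not show how client was created

def add_random(x):
    return x + random()

results = []
for i in range(200):
    results.append(client.submit(add_random, 2))
results[0]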

I noticed that all of the futures in results have the same key as results[0]. Consequently, all of the individual results in results have identical values.

On the other hand, if I make each function call unique:

def addone(x, i):
    # i is unused; it only makes each call's arguments distinct
    return x + random()

results = []
for i in range(200):
    results.append(client.submit(addone, 2, i))
results[0]

Each future has a unique key, and all results in the results list are unique.

Is this expected behavior? I initially expected the first case to behave like the second, with each call producing its own result.

Answers (1)

MRocklin

By default, Dask assumes that every function passed to it is deterministic (pure): given the same inputs, it produces the same output. Dask derives each future's key from the function and its arguments, so identical calls share one key and are computed only once. This lets Dask deduplicate work.

Your function breaks that assumption: because of the random() call, it returns a different value for the same input. You can override the default by passing the pure=False keyword argument to submit, which gives each call its own key.

future = client.submit(func, x, pure=False)
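
To make the mechanism concrete, here is a minimal sketch of key-based deduplication in plain Python. It is a toy model, not Dask's actual implementation: ToyClient, its md5-based token, and its results dictionary are all hypothetical stand-ins chosen for illustration. It only shows why identical pure calls collapse into one key while pure=False calls do not.

```python
import hashlib
import uuid
from random import random


class ToyClient:
    """Toy stand-in for a scheduler that deduplicates by key (not Dask's real code)."""

    def __init__(self):
        self.results = {}  # key -> computed result

    def submit(self, func, *args, pure=True):
        if pure:
            # Same function name and arguments always produce the same token.
            token = hashlib.md5(repr((func.__name__, args)).encode()).hexdigest()
        else:
            # Impure call: a fresh random token makes every key unique.
            token = uuid.uuid4().hex
        key = f"{func.__name__}-{token}"
        if key not in self.results:
            # Run the function only the first time this key is seen.
            self.results[key] = func(*args)
        return key, self.results[key]


def add_random(x):
    return x + random()


client = ToyClient()
pure_calls = [client.submit(add_random, 2) for _ in range(5)]
impure_calls = [client.submit(add_random, 2, pure=False) for _ in range(5)]

print(len({key for key, _ in pure_calls}))    # 1 distinct key
print(len({key for key, _ in impure_calls}))  # 5 distinct keys
```

In this sketch the five pure calls share one key and therefore one cached value, which mirrors the behavior in the question. The five pure=False calls each get their own key, so add_random runs once per call.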
