We're looking at using dask, in particular its lazy compute and DAG capabilities.
We have a moderately complicated compute DAG whose inputs are not known in advance. We want to build it ahead of time and then run it on different inputs.
I think we can do this with the dict / tuple interface:
from dask.threaded import get
import numpy as np  # pd.np is deprecated, so import numpy directly
import pandas as pd
power = lambda x, y: x**y
dsk = {'x': pd.Series(np.random.rand(20)),  # placeholder input, replaced below
       'y': 2,
       'z': (power, 'x', 'y'),
       'w': (sum, ['x', 'y', 'z'])}
Then we have dsk as the portable DAG, and can replace x with whatever we want (indeed, we didn't need to include it above initially).
dsk['x'] = pd.Series(np.random.rand(20))
get(dsk, 'w')
But can we do this with dask.imperative? My initial results suggest that we can't get to x:
from dask import do
x = pd.Series()
def filter_below_3(ds):
    return ds[ds < 3]
f = do(filter_below_3)
graph = f(x)
graph.dask
# {'filter_below_3-0ae5a18c-206d-4293-84b6-eb0d39243296': (<function __main__.filter_below_3>, [])}
Is there a way?
Reputation: 57281
dask.do and dask.value were both renamed to dask.delayed a long while ago. See the changelog for more information.
Currently there is no standard way to swap out leaf values within dask.imperative. However, there are a couple of decent options.
One option: dask.imperative just builds a dict for you, so you can swap out values after you construct the dictionary.
from operator import add, mul
from dask import do, value
from dask.threaded import get
input = value('dummy-value', name='my-special-input')
x = do(add)(input, 1)
y = do(mul)(x, x)
dsk = y.dask
>>> dsk['my-special-input'] = 10
>>> get(dsk, y.key)
121
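With the renamed API, the same swap can be sketched as below. This is a minimal sketch, assuming a recent dask release where do and value are both spelled delayed; the helper run_with and the variable names are made up for illustration. Because y.dask may be a read-only mapping in newer releases, the sketch copies it into a plain dict before replacing the leaf.

```python
from operator import add, mul

from dask import delayed
from dask.threaded import get

# Named placeholder leaf; its value is overwritten before every run.
placeholder = delayed('dummy-value', name='my-special-input')
x = delayed(add)(placeholder, 1)
y = delayed(mul)(x, x)

# Copy the graph into a plain dict so it can be modified freely.
template = dict(y.dask)


def run_with(value):
    """Hypothetical helper: run the prebuilt graph on one input value."""
    dsk = dict(template)              # keep the template itself untouched
    dsk['my-special-input'] = value   # swap in the real input
    return get(dsk, y.key)


print(run_with(10))  # 121
print(run_with(3))   # 16
```

Copying the template on every call keeps one run's input from leaking into the next.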
Alternatively, since all dask.imperative graphs should be fairly cheap to construct, you could write a function that produces the graph for each input:
def f(input):
    x = do(add)(input, 1)
    y = do(mul)(x, x)
    return y
>>> f(10).compute()
121
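For reference, here is roughly how the function-per-input approach looks with the renamed API. Treat it as a sketch, assuming a dask release that provides dask.delayed; the name build_graph and the sample inputs are only illustrative.

```python
from operator import add, mul

from dask import delayed


def build_graph(value):
    """Build a fresh lazy graph computing (value + 1) * (value + 1)."""
    x = delayed(add)(value, 1)
    y = delayed(mul)(x, x)
    return y


# Nothing runs until compute() is called, so rebuilding the graph is cheap.
results = [build_graph(v).compute() for v in (10, 3, 0.5)]
print(results)  # [121, 16, 2.25]
```

The graph is rebuilt on every call, which avoids mutating a shared dict at the cost of repeating the (small) construction work.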