Reputation: 69
I have a series of computations on some data that I'm modelling as a graph with dask delayed. This works well, but the graph itself takes as long as, or longer than, the calculations to create.
I add data throughout the day, so I would like to be able to change the inputs without recreating the graph. Is there a way to do this?
Upvotes: 1
Views: 124
Reputation: 16551
This is an advanced topic, so I am going to provide only a somewhat-hacky solution:
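import dask
from dask.multiprocessing import get

@dask.delayed
def myfunc(x):
    return x + 1

# Build a chain of tasks, giving each myfunc call an explicit key ('1', '2').
nested = 0
for x in range(1, 3):
    nested = myfunc(x * nested, dask_key_name=f'{x}')

# 1*0 + 1 = 1 -> 2*1 + 1 = 3
print(nested.compute())

# Copy the graph into a plain dict and swap the argument of task '1'.
# This assumes tasks are stored as (func, *args) tuples; newer dask
# releases may represent tasks differently.
dag_modified = nested.dask.to_dict()
dag_modified['1'] = (dag_modified['1'][0], 2)

# 1*2 + 1 = 3 -> 2*3 + 1 = 7
print(get(dag_modified, '2'))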
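If you control how the graph is built, a less fragile variant of the same idea might be to keep the input under its own key in a plain dict graph and overwrite only that key before each run. The sketch below is an assumption about how that could look, not something taken from your code: the key names ("input", "step1", "scaled", "step2") and the helper run_with are hypothetical, and it uses the synchronous dask.get scheduler to keep the example simple.

```python
from operator import add, mul

import dask

# Hypothetical hand-written graph in dask's dict format.
# "input" is a placeholder key that holds the raw data.
graph = {
    "input": 0,
    "step1": (add, "input", 1),   # input + 1
    "scaled": (mul, 2, "step1"),  # 2 * step1
    "step2": (add, "scaled", 1),  # scaled + 1
}


def run_with(graph, value):
    """Run the graph with a new input value, leaving the original untouched."""
    updated = dict(graph)      # shallow copy; the task definitions are reused
    updated["input"] = value   # swap only the placeholder
    return dask.get(updated, "step2")


first = run_with(graph, 0)   # same chain as above, starting from 0
second = run_with(graph, 2)  # same chain, starting from 2
print(first, second)
```

The graph is built once, and each new batch of data only costs a dict copy and one assignment. Swapping dask.get for another scheduler's get function should work the same way, as long as the task functions can be serialized.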
Upvotes: 1