Reputation: 3486
Given a dask.delayed task, I want to get a list of all the inputs (parents) for that task.
For example,
from dask import delayed

@delayed
def inc(x):
    return x + 1

def inc_list(x):
    return [inc(n) for n in x]

task = delayed(sum)(inc_list([1, 2, 3]))
task.parents ???
This yields the following graph. How could I get the parents of sum#3 such that it yields a list of [inc#1, inc#2, inc#3]?
Upvotes: 1
Reputation: 57251
Delayed objects don't store references to their inputs. However, you can get these back if you're willing to dig into the task graph a bit and reconstruct Delayed objects manually.
In particular, you can index into the .dask attribute with the delayed object's key:
>>> task.dask[task.key]
(<function sum>,
['inc-9d0913ab-d76a-4eb7-a804-51278882b310',
'inc-2f0e385e-beef-45e5-b47a-9cf5d02e2c1f',
'inc-b72ce20f-d0c4-4c50-9a88-74e3ef926dd0'])
This shows the task definition (see Dask's graph specification). The 'inc-...' values are other keys in the task graph. You can get the dependencies using the dask.core.get_dependencies function:
>>> from dask.core import get_dependencies
>>> get_dependencies(task.dask, task.key)
{'inc-2f0e385e-beef-45e5-b47a-9cf5d02e2c1f',
'inc-9d0913ab-d76a-4eb7-a804-51278882b310',
'inc-b72ce20f-d0c4-4c50-9a88-74e3ef926dd0'}
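To make the idea of "keys inside a task definition" concrete, here is a minimal pure-Python sketch of the lookup, assuming the graph is a plain dict in the style of Dask's graph specification. The function name direct_dependencies and the short keys 'inc-1', 'inc-2', 'inc-3', 'sum-4' are hypothetical and used only for illustration. This is not Dask's own implementation; in real code, prefer dask.core.get_dependencies.

```python
def inc(x):
    return x + 1


def direct_dependencies(graph, key):
    """Collect the graph keys referenced by the task stored under `key`.

    Hypothetical helper for illustration only, not part of Dask.
    """
    found = set()

    def walk(obj):
        # Anything hashable that is itself a key of the graph is a dependency.
        try:
            if obj in graph:
                found.add(obj)
                return
        except TypeError:
            pass  # unhashable (e.g. a list, or a tuple holding a list)
        # Otherwise look inside tasks (tuples) and lists of arguments.
        if isinstance(obj, (list, tuple)):
            for item in obj:
                walk(item)

    walk(graph[key])
    return found


# A hand-written graph with the same shape as the one in the question.
graph = {
    "inc-1": (inc, 1),
    "inc-2": (inc, 2),
    "inc-3": (inc, 3),
    "sum-4": (sum, ["inc-1", "inc-2", "inc-3"]),
}

parents_of_sum = direct_dependencies(graph, "sum-4")
parents_of_inc = direct_dependencies(graph, "inc-1")
```

Here parents_of_sum holds the three 'inc-...' keys, while parents_of_inc is empty because the inc tasks only take literal values.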
And from here you can make new delayed objects if you wish:
>>> from dask.delayed import Delayed
>>> parents = [Delayed(key, task.dask) for key in get_dependencies(task.dask, task.key)]
>>> parents
[Delayed('inc-b72ce20f-d0c4-4c50-9a88-74e3ef926dd0'),
Delayed('inc-2f0e385e-beef-45e5-b47a-9cf5d02e2c1f'),
Delayed('inc-9d0913ab-d76a-4eb7-a804-51278882b310')]
Upvotes: 1