Reputation: 174
I am trying to apply a function to a Dask array, using apply_along_axis, and while the same function works on a numpy array, it does not work on a Dask array. Here is an example:
import numpy
import dask.array as da

w = numpy.array([[6, 7, 8], [9, 10, 11]])
q = numpy.array([[1, 2, 3], [4, 5, 6]])
s = numpy.stack([w, q])  # shape (2, 2, 3)

def func(arr):
    # arr is a 1-D slice taken along the chosen axis
    t, y = arr[0], arr[1]
    return t + y

s_dask = da.from_array(s)
Running func on the numpy array works as expected, whereas running it on the Dask array raises an error: IndexError: index 1 is out of bounds for axis 0 with size 1
>>> s
array([[[ 6,  7,  8],
        [ 9, 10, 11]],

       [[ 1,  2,  3],
        [ 4,  5,  6]]])
>>> numpy.apply_along_axis(func, 0, s)
array([[ 7,  9, 11],
       [13, 15, 17]])
>>> da.apply_along_axis(func, 0, s_dask)
Traceback (most recent call last):
  File "<pyshell#151>", line 1, in <module>
    da.apply_along_axis(func,0,s_dask)
  File "..Python37\lib\site-packages\dask\array\routines.py", line 383, in apply_along_axis
    test_result = np.array(func1d(test_data, *args, **kwargs))
  File "<pyshell#149>", line 2, in func
    t, y = a[0],a[1]
IndexError: index 1 is out of bounds for axis 0 with size 1
I am not sure what I am doing wrong here.
Upvotes: 2
Views: 268
Reputation: 57271
Dask array is trying to figure out what the dtype of the output array will be. To do this it sends a small test array through your function (a single element, judging by the "size 1" in the error message). That call fails because your function assumes that the input has at least two elements.
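The failure can be reproduced without Dask at all. This is a minimal sketch, assuming the test input is a length-1 array of ones; the exact contents are an assumption, since the error message only confirms the size. The name probe is made up for illustration.

```python
import numpy

def func(arr):
    # Same function as in the question: it needs at least two elements.
    t, y = arr[0], arr[1]
    return t + y

# Hypothetical stand-in for the small test array that Dask passes to func.
probe = numpy.ones((1,), dtype=int)

try:
    func(probe)
    error_message = None
except IndexError as exc:
    # arr[1] does not exist on a length-1 array.
    error_message = str(exc)

print(error_message)
```

So the error comes from the test call, not from your real data.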
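You can save Dask the trouble by providing the dtype explicitly. Recent Dask versions also infer the output shape from the same test call and, as far as I can tell, skip it only when shape is given as well. Your function returns a scalar for each 1-D slice, so the shape is ().
da.apply_along_axis(func, 0, s_dask, dtype=s_dask.dtype, shape=())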
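For reference, here is the whole thing as one runnable sketch. It assumes a Dask version whose apply_along_axis accepts both the dtype and shape keyword arguments; older releases may not, so treat it as an illustration rather than a guaranteed fix for every version. shape=() reflects that func returns a scalar for each 1-D slice.

```python
import numpy
import dask.array as da

def func(arr):
    # Same function as in the question.
    t, y = arr[0], arr[1]
    return t + y

s = numpy.stack([
    numpy.array([[6, 7, 8], [9, 10, 11]]),
    numpy.array([[1, 2, 3], [4, 5, 6]]),
])
s_dask = da.from_array(s)

# Assumption: with both dtype and shape given, Dask does not need to
# call func on a test array to infer them.
lazy = da.apply_along_axis(func, 0, s_dask, dtype=s_dask.dtype, shape=())
result = lazy.compute()

# NumPy result on the same data, for comparison.
expected = numpy.apply_along_axis(func, 0, s)
print(result)
```

If everything is set up as assumed, result should match what numpy.apply_along_axis returns for s.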
Upvotes: 3