nazar

Reputation: 167

Dask Distributed produces AttributeError: 'HighLevelGraph' object has no attribute '__dask_distributed_pack__'

I have a small development cluster of three AWS T2 machines: one serves as the client, one as the scheduler, and one as the worker. On all three I performed a git clone and manually installed NumPy 1.21.0. However, when following the basic setup, the error below is produced when executing A = client.map(square, range(10)) in the Python 3.8 interpreter. How can this be fixed? It looks like an internal error. Dask was installed with pip install on the client machine.

ubuntu@ip-172...:~$ python3
Python 3.8.5 (default, Jan 27 2021, 15:41:15) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from dask.distributed import Client
>>> client = Client('IPv4Addr:8786')
>>> client
<Client: 'tcp://172...:8786' processes=1 threads=4, memory=8.18 GB>
>>> def square(x):
...     return x ** 2
... 
>>> A = client.map(square, range(10))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/distributed-2021.2.0+19.g2c5d2cf8-py3.8.egg/distributed/client.py", line 1764, in map
    futures = self._graph_to_futures(
  File "/usr/local/lib/python3.8/dist-packages/distributed-2021.2.0+19.g2c5d2cf8-py3.8.egg/distributed/client.py", line 2542, in _graph_to_futures
    dsk = dsk.__dask_distributed_pack__(self, keyset)
AttributeError: 'HighLevelGraph' object has no attribute '__dask_distributed_pack__'

Upvotes: 1

Views: 1367

Answers (2)

Boris Lau

Reputation: 55

We had the same error message. It turned out that we had a version mismatch between the dask and distributed packages: distributed had somehow been upgraded to 2021.3.0 while dask was still at 2020.12.0. Downgrading distributed back to its previous version fixed the problem.
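A quick way to spot this kind of mismatch is to print the installed versions of both packages on each machine. The sketch below is only an illustration: the helper names installed_versions and versions_match are made up here, and treating "same version string" as "compatible" is an assumption that will not hold for development builds such as the 2021.2.0+19.g2c5d2cf8 egg in the question's traceback.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("dask", "distributed")):
    """Return {package: version string, or None if not installed} for this environment."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def versions_match(found):
    """True only when every package is installed and all version strings are equal.

    Assumption: dask and distributed are expected to carry the same version
    number; a dev/local build suffix will make this report a mismatch.
    """
    values = list(found.values())
    return None not in values and len(set(values)) == 1


if __name__ == "__main__":
    found = installed_versions()
    print(found, "OK" if versions_match(found) else "MISMATCH")
```

Running this on the client, scheduler, and worker and comparing the output should show whether one machine has drifted. With a connected client, distributed's Client.get_versions(check=True) should report cluster-wide mismatches as well, if the client can get that far.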

Upvotes: 0

nazar

Reputation: 167

For anyone having the same issue, a possible fix (the one that worked for us) is to create a virtual environment and install Dask and all its dependencies in it.

1- Install Dask in a newly created venv

2- Produce a requirements specification with $ pip freeze > ~/requirements.txt

3- On the worker and client machines, create a venv and run $ pip install -r requirements.txt in that env, using the same requirements.txt.

This should give identical environments on every machine and hopefully prevent issues such as the one described in the original question.
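To confirm that the environments really ended up identical, the pip freeze output from two machines can be compared. This is a minimal sketch under some assumptions: parse_freeze and diff_pins are hypothetical helper names, only plain name==version lines are considered (editable installs and "name @ url" entries are skipped), and the sample strings below are illustrative, not output from the cluster in the question.

```python
def parse_freeze(text):
    """Parse `pip freeze` output into {name: version}, skipping lines without '=='."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, _, ver = line.partition("==")
        pins[name.strip().lower()] = ver.strip()
    return pins


def diff_pins(a, b):
    """Return {name: (version_in_a, version_in_b)} for every package that differs.

    A package missing on one side shows up with None for that side.
    """
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    # Illustrative inputs only; in practice, read each machine's requirements.txt.
    client_freeze = "dask==2020.12.0\ndistributed==2021.3.0\n"
    worker_freeze = "dask==2020.12.0\ndistributed==2020.12.0\n"
    print(diff_pins(parse_freeze(client_freeze), parse_freeze(worker_freeze)))
```

An empty result means the pinned packages agree on both sides; any entry points at a package worth reinstalling from the shared requirements.txt.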

Upvotes: 2
