Reputation: 167
I have a small development cluster on three AWS T2 machines: one serves as the client, one as the scheduler, and one as the worker. On all three I performed a git clone and manually installed NumPy version 1.21.0. However, when following the basic setup, the error below is produced when executing A = client.map(square, range(10)) in the Python 3 (3.8) interpreter. How can this be fixed? It looks like an internal error. Dask was installed with pip install on the client machine.
ubuntu@ip-172...:~$ python3
Python 3.8.5 (default, Jan 27 2021, 15:41:15)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from dask.distributed import Client
>>> client = Client('IPv4Addr:8786')
>>> client
<Client: 'tcp://172...:8786' processes=1 threads=4, memory=8.18 GB>
>>> def square(x):
...     return x ** 2
...
>>> A = client.map(square, range(10))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/distributed-2021.2.0+19.g2c5d2cf8-py3.8.egg/distributed/client.py", line 1764, in map
    futures = self._graph_to_futures(
  File "/usr/local/lib/python3.8/dist-packages/distributed-2021.2.0+19.g2c5d2cf8-py3.8.egg/distributed/client.py", line 2542, in _graph_to_futures
    dsk = dsk.__dask_distributed_pack__(self, keyset)
AttributeError: 'HighLevelGraph' object has no attribute '__dask_distributed_pack__'
Upvotes: 1
Views: 1367
Reputation: 55
We had the same error message. It turned out that we had a version mismatch between the dask and distributed packages: distributed had somehow been upgraded to 2021.3.0 while dask was still at 2020.12.0. Downgrading distributed to its previous version fixed the problem.
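A quick way to check for this kind of mismatch on each machine is to compare the installed versions of the two packages. The snippet below is only a sketch: the helper names (installed_version, release_part, versions_match) are made up for illustration, and it assumes that dask and distributed are meant to share the same release number, which was the case for the versions mentioned here.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def release_part(version):
    """Drop a local suffix such as '+19.g2c5d2cf8' and keep the release number."""
    return version.split("+", 1)[0]


def versions_match(dask_version, distributed_version):
    """True only if both packages are present and share the same release number.

    Assumption: dask and distributed are released in lockstep, so equal
    release numbers are taken to mean a compatible pair.
    """
    if dask_version is None or distributed_version is None:
        return False
    return release_part(dask_version) == release_part(distributed_version)


if __name__ == "__main__":
    dask_version = installed_version("dask")
    distributed_version = installed_version("distributed")
    print("dask:", dask_version)
    print("distributed:", distributed_version)
    print("match:", versions_match(dask_version, distributed_version))
```

Running it on the client, scheduler and worker should show quickly whether one of them has drifted, as in our case with 2020.12.0 versus 2021.3.0.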
Upvotes: 0
Reputation: 167
To anyone having the same issue: a possible fix (the one that worked for us) is to create a virtual environment and install Dask and all of its dependencies there.
1- Install Dask in a newly created venv.
2- Produce a requirements specification with $ pip freeze > ~/requirements.txt
3- On the worker and client machines, create a venv and run $ pip install -r requirements.txt
in that env.
This keeps the environments identical, and should prevent issues such as the one described in the original question.
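To confirm afterwards that two machines really ended up with the same packages, you can run pip freeze on each and compare the outputs. Below is a minimal sketch of such a comparison; parse_freeze and diff_freeze are hypothetical helper names, and only plain name==version lines are considered (editable installs, URLs and comments are skipped).

```python
def parse_freeze(text):
    """Parse `pip freeze` output into {package: version}.

    Lines without '==' (comments, editable installs, direct URLs) are skipped.
    """
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins


def diff_freeze(left_text, right_text):
    """Return {package: (left_version, right_version)} for every difference.

    A package missing on one side shows up with None for that side.
    """
    left = parse_freeze(left_text)
    right = parse_freeze(right_text)
    return {
        name: (left.get(name), right.get(name))
        for name in sorted(set(left) | set(right))
        if left.get(name) != right.get(name)
    }


if __name__ == "__main__":
    # Example inputs; in practice, paste the `pip freeze` output of two machines.
    client_freeze = "dask==2020.12.0\ndistributed==2021.3.0\nnumpy==1.21.0\n"
    worker_freeze = "dask==2020.12.0\ndistributed==2020.12.0\nnumpy==1.21.0\n"
    for package, (client_v, worker_v) in diff_freeze(client_freeze, worker_freeze).items():
        print(f"{package}: client={client_v} worker={worker_v}")
```

An empty result means the pinned packages agree. As far as I know, distributed also has Client.get_versions(check=True) for comparing versions across client, scheduler and workers, but it needs a working connection to the cluster first.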
Upvotes: 2