Reputation: 353
We have an Azure Databricks cluster deployed in a Virtual Network, with a Network Security Group that only allows connections between cluster nodes and blocks internet access.
When we try to install a library from PyPI and start the cluster, the cluster reports this error:
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError(': Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/azure-datalake-store/
The curious thing is that if you try to import a Maven library, it works properly.
Does anybody know how to solve this issue?
Thanks.
Upvotes: 0
Views: 1538
Reputation: 353
We opened the IP range 151.101.0.0/16 on port 443 in the Network Security Group, and the PyPI libraries now install correctly.
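To see whether a given host is covered by that range, something like the following may help. It is only a rough sketch: it resolves a hostname and reports which of its IPv4 addresses fall inside 151.101.0.0/16. The helper names are made up for illustration, and the hostnames in the usage note are an assumption about what pip contacts; the addresses returned can vary by location and over time.

```python
import ipaddress
import socket
from typing import Dict, List

# The range opened in the Network Security Group (taken from the answer above).
ALLOWED_RANGE = ipaddress.ip_network("151.101.0.0/16")


def in_allowed_range(address: str) -> bool:
    """Return True if the IPv4 address lies inside ALLOWED_RANGE."""
    return ipaddress.ip_address(address) in ALLOWED_RANGE


def resolve_ipv4(host: str, port: int = 443) -> List[str]:
    """Resolve a hostname to its distinct IPv4 addresses (needs working DNS)."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def check_host(host: str) -> Dict[str, bool]:
    """Map each resolved address of the host to whether the NSG rule covers it."""
    return {address: in_allowed_range(address) for address in resolve_ipv4(host)}
```

Run from a machine with DNS access, `check_host("pypi.org")` and `check_host("files.pythonhosted.org")` show whether every address those names currently resolve to is covered. If any address falls outside the range, the rule may need to be widened.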
Upvotes: 0
Reputation: 2483
Log4j ships with Databricks, so it was probably found in a local cache. If you try some arbitrary Maven library, it should fail.
As for PyPI: the cluster cannot reach it directly, so you cannot add libraries that way. Instead, download the library to your desktop and install it manually from the UI.
You will need to upload the library file to DBFS yourself, using the CLI or PowerShell. Then add the library with the add > library option in the workspace, and point it at your file location.
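The download and upload steps could be scripted roughly as follows. This is a sketch under assumptions, not a tested procedure: it assumes `pip` and the Databricks CLI (`databricks fs cp`) are installed and configured on a machine that does have internet access, and the directory, wheel filename, and DBFS path in the usage note are hypothetical placeholders.

```python
import subprocess
from typing import List


def pip_download_command(package: str, dest_dir: str) -> List[str]:
    """Build the command that downloads a package (and its dependencies) locally."""
    return ["pip", "download", package, "--dest", dest_dir]


def dbfs_copy_command(local_path: str, dbfs_path: str) -> List[str]:
    """Build the Databricks CLI command that copies a local file to DBFS."""
    if not dbfs_path.startswith("dbfs:/"):
        raise ValueError("DBFS destination must start with 'dbfs:/'")
    return ["databricks", "fs", "cp", local_path, dbfs_path]


def run(command: List[str]) -> None:
    """Run a command and raise CalledProcessError if it exits with an error."""
    subprocess.run(command, check=True)
```

For example, `run(pip_download_command("azure-datalake-store", "./wheels"))` followed by `run(dbfs_copy_command("./wheels/<file>.whl", "dbfs:/FileStore/libs/<file>.whl"))` for each downloaded file, where `<file>` stands for whatever filename pip actually produced. The uploaded DBFS path is then the file location to give when adding the library in the workspace.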
Upvotes: 0