Reputation: 336
I installed a Cloud Datalab notebook on a Cloud Dataproc cluster following the instructions in the official documentation.
After creating the cluster, I created an SSH tunnel to the master node from Cloud Shell and connected to the cluster web interface through Cloud Shell, as described in the instructions. I could access the Jupyter notebooks after this. I used the -v (verbose) option to see the SSH connection logs:
gcloud compute ssh cluster-datalab-m --project=abcxyz-123 --zone us-west1-a \
-- -v -4 -N -L 8080:cluster-datalab-m:8080
But after some time, I got a popup saying "A connection to the notebook server could not be established. The notebook will continue trying to reconnect. Check your network connection or notebook server configuration." and my cluster stopped responding to any commands.
When I looked at the SSH output on the Cloud Shell, I saw that multiple channels were being requested at this point.
An excerpt of the SSH logs from around the point where the connection broke:
debug1: channel 1: new [direct-tcpip]
debug1: Connection to port 8080 forwarding to cluster-datalab-m port 8080 requested.
debug1: channel 2: new [direct-tcpip]
debug1: channel 1: free: direct-tcpip: listening port 8080 for cluster-datalab-m port 8080, connect from 127.0.0.1 port 52832 to 127.0.0.1 port 8080, nchannels 3
debug1: channel 2: free: direct-tcpip: listening port 8080 for cluster-datalab-m port 8080, connect from 127.0.0.1 port 52833 to 127.0.0.1 port 8080, nchannels 2
debug1: Connection to port 8080 forwarding to cluster-datalab-m port 8080 requested.
debug1: channel 1: new [direct-tcpip]
debug1: Connection to port 8080 forwarding to cluster-datalab-m port 8080 requested.
debug1: channel 2: new [direct-tcpip]
debug1: channel 2: free: direct-tcpip: listening port 8080 for cluster-datalab-m port 8080, connect from 127.0.0.1 port 52837 to 127.0.0.1 port 8080, nchannels 3
debug1: Connection to port 8080 forwarding to cluster-datalab-m port 8080 requested.
debug1: channel 2: new [direct-tcpip]
debug1: Connection to port 8080 forwarding to cluster-datalab-m port 8080 requested.
debug1: channel 3: new [direct-tcpip]
debug1: Connection to port 8080 forwarding to cluster-datalab-m port 8080 requested.
debug1: channel 4: new [direct-tcpip]
debug1: Connection to port 8080 forwarding to cluster-datalab-m port 8080 requested.
debug1: channel 5: new [direct-tcpip]
debug1: Connection to port 8080 forwarding to cluster-datalab-m port 8080 requested.
debug1: channel 6: new [direct-tcpip]
debug1: Connection to port 8080 forwarding to cluster-datalab-m port 8080 requested.
I closed this SSH connection manually and then tried to SSH into the master node by clicking the SSH button in the Compute Engine console, but even that took a long time and didn't complete successfully.
I looked at this Stack Overflow question, but I couldn't find an /etc/sshguard folder on the master node, so I don't think that is the issue in my case. The master node was running Debian 8.10.
Is there any way to ensure that the SSH connection (and the Jupyter notebook) works continuously?
Upvotes: 0
Views: 2748
Reputation: 1383
We've updated the documentation at cluster web interfaces. Using Cloud Shell works for Datalab, but not for Jupyter: Cloud Shell Preview only supports HTTP, but Jupyter uses WebSockets.
Instead, you should follow the instructions for setting up a SOCKS proxy and pointing Chrome at it. There's a handy bash script called launch-jupyter-interface.sh that does that for you. You'll just need to modify it to point to your Chrome installation.
The Jupyter tutorial also mentions using that script.
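If you'd rather do it by hand, the SOCKS proxy approach comes down to roughly two commands. The sketch below is a dry run that only prints them, using the project, zone, and master node name from the question. The port (1080), the Chrome path, and the helper function names are assumptions for illustration, and this is not the contents of launch-jupyter-interface.sh.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the two commands instead of running them.
# PROJECT, ZONE and MASTER come from the question; PORT and CHROME are assumptions.
PROJECT="abcxyz-123"
ZONE="us-west1-a"
MASTER="cluster-datalab-m"
PORT=1080
CHROME="/usr/bin/google-chrome"   # assumed path; point this at your Chrome installation

# Command that opens a SOCKS proxy on localhost:${PORT}.
# -D enables dynamic (SOCKS) forwarding, -N skips running a remote command.
tunnel_cmd() {
  echo "gcloud compute ssh ${MASTER} --project=${PROJECT} --zone=${ZONE} -- -D ${PORT} -N"
}

# Command that starts Chrome with a throwaway profile and sends its traffic
# through the proxy, so the master's hostname is resolved on the cluster side.
browser_cmd() {
  echo "${CHROME} --proxy-server=socks5://localhost:${PORT} --user-data-dir=/tmp/${MASTER} http://${MASTER}:8080"
}

tunnel_cmd
browser_cmd
```

Run the first printed command in a local terminal (not in Cloud Shell) and leave it open, then run the second one in another terminal. Because the browser talks to the notebook server through the SOCKS proxy rather than through the Cloud Shell preview, WebSocket connections should go through as well.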
Upvotes: 1