s1mc0d3

Reputation: 523

Accessing bokeh server using dask distributed via ssh tunnelling

Issue
I am setting up a cluster for running image analysis (moving from MPI to Dask and Dask.distributed). I connect to the master node through an ssh tunnel, but I don't know how to access the bokeh server.

Steps
1. Connect to my server master node via ssh tunneling:
ssh -L 7000:localhost:7000 [email protected]
2. Start dask-scheduler --port 7001 --bokeh 7002
3. ssh to the nodes I want to use (also tunneling on port 7000) and start dask-worker --memory-limit=200e9
4. Start a jupyter notebook --port=7000 --no-browser, open a Chrome session and point the browser to localhost:7000
5. Start a Client() pointing to the scheduler address
6. X11 forwarding is broken and I cannot use it from my laptop

When I look at the output of dask-scheduler, I get:

distributed.scheduler - INFO - -----------------------------------------------
distributed.scheduler - INFO -   Scheduler at: tcp://130.237.132.207:7001
distributed.scheduler - INFO -        http at:              0.0.0.0:9786
distributed.scheduler - INFO -       bokeh at:              0.0.0.0:7002
distributed.scheduler - INFO - Local Directory:    /tmp/scheduler-4we9jlcj
distributed.scheduler - INFO - -----------------------------------------------
distributed.scheduler - INFO - Register tcp://192.168.0.3:43973
distributed.scheduler - INFO - Starting worker compute stream, tcp://192.168.0.3:43973
distributed.scheduler - INFO - Receive client connection: Client-6967349a-872f-11e7-a595-0cc47a8ebf44

and the client seems to connect correctly to the workers:

Scheduler: tcp://130.237.132.207:7001
Dashboard: http://130.237.132.207:7002
Workers: 1
Cores: 56
Memory: 200.00 GB  

Questions
1) Is it correct to point the browser to port 7000 instead of port 7001, where the scheduler is set? FYI: I cannot load anything in the browser if I use localhost:7001 or any of the IP addresses of the scheduler and dashboard.
2) How can I get access to the bokeh graph to evaluate performance?
3) Additional bonus: is there a way to start multiple workers with dask-ssh while passing parameters such as --memory-limit?

Thanks!

Upvotes: 1

Views: 1282

Answers (1)

MRocklin

Reputation: 57281

It looks like you are hosting your bokeh dashboard on port 7002. You need to set up a second ssh tunnel for that port as well. This might look like the following:

ssh -L 7002:localhost:7002 [email protected]
open http://localhost:7002
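
Since ssh accepts the -L option more than once, the notebook port and the dashboard port could probably share a single connection. Here is a minimal Python sketch of that idea, plus a quick check that the forwarded port answers locally. The helper names tunnel_command and port_is_open are made up for illustration, and user@host is a placeholder for your own login, not a real address.

```python
import shlex
import socket


def tunnel_command(user_host, ports):
    """Build an ssh argument list that forwards each local port
    to the same port on the remote machine's localhost."""
    args = ["ssh"]
    for port in ports:
        # One -L flag per forwarded port: local_port:remote_host:remote_port
        args += ["-L", f"{port}:localhost:{port}"]
    args.append(user_host)
    return args


def port_is_open(port, host="localhost", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # "user@host" is a placeholder; 7000 is the notebook, 7002 the dashboard.
    print(shlex.join(tunnel_command("user@host", [7000, 7002])))
    # Only meaningful once the tunnel and the scheduler are both running.
    print("dashboard reachable:", port_is_open(7002))
```

If port_is_open(7002) returns False while the scheduler is running, the tunnel for that port is most likely missing or was opened to a different machine than the one hosting the scheduler.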

Passing through keywords to dask-ssh sounds like a good idea. I recommend opening an issue and, if you have time, perhaps a pull request :)

Upvotes: 2
