Reputation: 1341
I'm currently training a neural network on a remote server, using jupyter notebook. I set it up with the following:
Now, when I reconnect to the jupyter notebook in the browser, I don't see the current output of the training cell, only the output that I saw when I was watching the first 10 minutes of training.
I tried to find a solution for this and I think there are some GitHub issues about this particular problem, but they are old and I couldn't figure out whether the issue has been solved or not.
edit// To make my intentions clearer, since I found some threads on Stack Overflow that address this problem: I don't want to wait for the training to complete, as I might want to kill the training before it finishes if it absolutely doesn't go the way I expect it to. So some sort of 'live' output, or at least regular output, would be nice.
Upvotes: 61
Views: 14667
Reputation: 4171
This is a long-standing missing feature in Jupyter notebooks. I use a near-identical setup: my notebook runs inside a tmux session on a remote server, and I use it locally with SSH tunneling.
Before doing any work, I run the following snippet in the first cell:
import sys
import logging

# Duplicate everything written to stdout/stderr into a log file on the server
nblog = open("nb.log", "a+")
sys.stdout.echo = nblog
sys.stderr.echo = nblog

# Send IPython's own log messages to the same file
get_ipython().log.handlers[0].stream = nblog
get_ipython().log.setLevel(logging.INFO)

# Autosave the notebook every 5 seconds
%autosave 5
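The .echo attribute used above comes from ipykernel's output stream objects. If your sys.stdout doesn't happen to expose it, a rough, minimal alternative (just a sketch; the Tee class below is my own helper, not part of Jupyter) is to wrap the streams yourself:

import sys

class Tee:
    # Minimal helper: every write goes to the original notebook stream and to the log file
    def __init__(self, stream, logfile):
        self.stream = stream
        self.logfile = logfile
    def write(self, data):
        self.stream.write(data)
        self.logfile.write(data)
        self.logfile.flush()  # flush so the file can be tailed while a cell is still running
    def flush(self):
        self.stream.flush()
        self.logfile.flush()

nblog = open("nb.log", "a+")
sys.stdout = Tee(sys.stdout, nblog)
sys.stderr = Tee(sys.stderr, nblog)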
Now let's say I run a cell that will take a while to complete (like a training run). Something like:
import time

def train(num_epochs):
    for epoch in range(num_epochs):
        time.sleep(1)
        print(f"Completed epoch {epoch}")

train(1000)
Now while train(1000) is running, after the first 10 seconds, I want to do something else: close the browser and also disconnect from my remote connection.
(Note the modified short autosave duration; I added that as I often forget to save the notebook before closing the browser tab.)
After 500 seconds have passed, I can reconnect to the remote server and open the notebook in my browser. The output of this cell will have stopped at "Completed epoch 9", i.e. the point when I disconnected. However, the kernel will still actually be running train in the background on the server, and the notebook will also show it as "busy".
We can now simply open the file nb.log and we'll find all the logs, including the ones produced after we closed the browser and the connection. We can keep refreshing the nb.log file at our leisure and new logs will keep appearing, until the kernel finishes running train().
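If you'd rather check the latest progress from a Python prompt on the server (or from another notebook cell) instead of opening the file in an editor, a trivial snippet like this one (my own addition, not part of the recipe above) does the job:

# Show the last 20 lines of the training log written by the long-running cell
with open("nb.log") as f:
    for line in f.readlines()[-20:]:
        print(line, end="")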
Now if we want to stop train() before it's done, we can just press the Interrupt button in Jupyter. The kernel will be freed and we can run other things (a KeyboardInterrupt error message will also show up in your nb.log file). All our precomputed notebook variables and imported libraries are still there, as the kernel was never actually disconnected.

Although this isn't a very sophisticated solution, I find it quite easy to implement.
Upvotes: 16
Reputation: 21
I'm currently facing the same problem and I found this discussion. The Papermill tool mentioned there works quite well. Just use something like:
nohup papermill --request-save-on-cell-execute --no-progress-bar input.ipynb output.ipynb &
input.ipynb is the notebook with your source code.
output.ipynb is the processed notebook where you can see the output.
--request-save-on-cell-execute writes each cell's output into the output.ipynb notebook after the cell is completed.
--no-progress-bar disables the progress bar, which is quite useless if you do all the work in one cell.
nohup is there so papermill keeps running after you log out from the server, and the trailing & runs it in the background.
All Papermill options can be found in the Papermill documentation.
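For completeness, Papermill also exposes a Python API, so the same run can be started from a short driver script instead of the command line. A sketch, assuming the same input.ipynb / output.ipynb file names as above:

import papermill as pm

# Run the notebook top to bottom; output.ipynb is saved after each cell finishes
pm.execute_notebook(
    "input.ipynb",
    "output.ipynb",
    request_save_on_cell_execute=True,  # same effect as --request-save-on-cell-execute
    progress_bar=False,                 # same effect as --no-progress-bar
)

You would still launch this script with nohup ... & (or inside tmux) so it survives logging out.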
Upvotes: 0
Reputation: 310
This is still an OPEN issue in the official JupyterLab repository on GitHub. See https://github.com/jupyterlab/jupyterlab/issues/2833, "Reconnect to running session: keeping output".
Upvotes: 6
Reputation: 53
Another option is to use a .py file instead of a .ipynb file (Jupyter notebook), and inside this .py file print the results you need to check how your code is behaving.
To convert from a .ipynb to a .py file you can use this command:
jupyter nbconvert --to script example.ipynb
Now you can work with a Python script instead of a Jupyter notebook file, which will make things easier.
In your script, add print() calls at the stages you think necessary so that you can follow the progress in the tmux terminal. That way you can kill your training whenever you want (Ctrl+C), or not; tmux keeps the session running if you want, just press Ctrl+B then D to detach from the session.
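As a concrete illustration, reusing the train() example from the answer above and assuming the file is called example.py (the name from the nbconvert command), the script could look like this; flush=True makes each line appear in the tmux pane immediately:

# example.py - run with "python example.py" inside a tmux session
import time

def train(num_epochs):
    for epoch in range(num_epochs):
        time.sleep(1)
        # flush=True so progress shows up in the terminal right away
        print(f"Completed epoch {epoch}", flush=True)

if __name__ == "__main__":
    train(1000)

Detach with Ctrl+B then D, and reattach later with tmux attach to see how far the training has got.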
Upvotes: 1