rmalhotra

Reputation: 51

Can the notebook generated by papermill be output with a live, running kernel?

When papermill generates a notebook, a .ipynb file is created at the output path, but it shows as not running on the Jupyter home page. I would prefer that, when the notebook has finished executing, it remain attached to a live kernel so I can interact with any variables inside it. As it stands, I have to re-run the cells to recreate the variables that were generated in the notebook, which is especially cumbersome for time-intensive notebooks.

I am generating the notebooks using the execute_notebook function.
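For reference, the call looks roughly like this (the paths and parameter are placeholders):

import papermill as pm

pm.execute_notebook(
    "input.ipynb",               # notebook to run
    "output.ipynb",              # executed copy written here
    parameters={"alpha": 0.5},   # injected into the parameters cell
)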

My feeling is that this is not possible, because while the new notebook is being executed it never shows as "Running" on my Jupyter home page. Is what I am asking for even possible with papermill, or is there another way of achieving this that scales?

Upvotes: 5

Views: 3191

Answers (4)

Eduardo

Reputation: 1423

This is not possible with papermill (without too much extra work) since it runs your code in a separate process.

However, you can do it with ploomber-engine; it's similar to papermill, except it runs the notebook in the same process, allowing you to extract variables from the notebook after it runs.

from ploomber_engine.ipython import PloomberClient

# initialize client
client = PloomberClient.from_path("notebook.ipynb")

# this will run the notebook and expose all the variables/functions
namespace = client.get_namespace()

# extract variables or any other object
some_variable = namespace["some_variable"]
some_function = namespace["some_function"]

assert some_variable == 42
assert some_function(1, 41) == 42

Here's a complete example.

Upvotes: 1

mlukas79

Reputation: 11

As far as I am aware, there are several options for that. Papermill used to allow recording variables in the notebook using papermill.record(), which has been deprecated; I believe you can install an older version and still use it.
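A rough sketch of that older API (papermill 1.x; the notebook path and names are placeholders):

# inside the notebook that papermill executes
import papermill as pm
pm.record("accuracy", 0.95)      # saves the value into the notebook's output

# later, in a separate process
nb = pm.read_notebook("output.ipynb")
print(nb.dataframe)              # recorded values appear here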

Another option they suggest is to use scrapbook. You can find more about it here.
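A minimal sketch with scrapbook (notebook path and scrap name are placeholders):

# inside the executed notebook
import scrapbook as sb
sb.glue("accuracy", 0.95)        # persist the value as a "scrap"

# afterwards, from anywhere else
nb = sb.read_notebook("output.ipynb")
print(nb.scraps["accuracy"].data)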

You can also use the %store magic: Share data between IPython Notebooks
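For example, assuming a variable named result:

# in the notebook that papermill executes
result = 42          # placeholder for whatever you computed
%store result        # persists it in IPython's storage

# in the interactive notebook where you want it back
%store -r result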

Finally, you can simply write to flat files using Python's context manager, either as plain text or as JSON:

with open('<dir>', 'w') as file:
    file.write(<var_of_choice>)      # plain-text write; <var_of_choice> must be a string

import json
with open(<out_path>, "a+") as file:
    json.dump(<var_of_choice>, file)  # or serialize it as JSON

If your notebooks load a lot of data, it may be sub-optimal to leave kernels running.

Upvotes: 1

Pyrce

Reputation: 8571

You could implement this by following the docs on extending papermill to write a custom engine that either links to a live kernel or leaves the kernel up post-execution. This would require a little custom code to prevent nbconvert from stopping the kernel and/or to pass the target kernel into papermill's execute function. Possible, but not out of the box.
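Very roughly, such an engine could look like this (the engine name and keep-alive logic are hypothetical; assumes papermill 2.x, where the default engine is NBClientEngine):

import papermill as pm
from papermill.engines import NBClientEngine, papermill_engines

class KeepAliveEngine(NBClientEngine):
    @classmethod
    def execute_managed_notebook(cls, nb_man, kernel_name, **kwargs):
        # custom logic to attach to / preserve a kernel would go here,
        # before delegating to the stock nbclient-based execution
        return super().execute_managed_notebook(nb_man, kernel_name, **kwargs)

# register under a custom name and select it at execution time
papermill_engines.register("keepalive", KeepAliveEngine)
pm.execute_notebook("input.ipynb", "output.ipynb", engine_name="keepalive")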

Upvotes: 2

Inon Peled

Reputation: 711

Keeping the kernel running does indeed sound useful, and I too could not find support for this in the papermill documentation.

It appears that the kernel may not run with any user interface, e.g., any local port that you can browse to, so that even if it remained running after execution, you would not be able to interact with it anyway.

However, it seems that you do not need to re-run anything in the saved notebooks to recover already computed variables, as you can simply use papermill.read_notebook, no?

Upvotes: 1
