profj

Reputation: 331

Jupyter Notebook nbconvert without magic commands/ w/o markdown

I have a Jupyter notebook and I'd like to convert it into a Python script using the nbconvert command from within the Jupyter notebook.

I have included the following line at the end of the notebook:

!jupyter nbconvert --to script <filename>.ipynb

This creates a Python script. However, I'd like the resulting .py file to have the following properties:

  1. No input statements, such as:

    # In[27]:

  2. No markdown, including statements such as:

    # coding: utf-8

  3. Ignore %magic commands such as:

    1. %matplotlib inline
    2. !jupyter nbconvert --to script <filename>.ipynb, i.e. the command within the notebook that executes the Python conversion

    Currently, the %magic commands get translated to the form: get_ipython().magic(...), but these are not necessarily recognized in Python.

Upvotes: 7

Views: 3473

Answers (5)

Josiah Yoder

Reputation: 3756

This is just a quote of an answer to a different question. Upvotes there first, please.

You really should consider using jupytext.

Run conda install jupytext or pip install jupytext

Then do: jupytext --set-formats ipynb,py <file>.ipynb

This will create the .py file. As a bonus, adding --sync keeps it synchronized with the .ipynb file:

jupytext --set-formats ipynb,py <file>.ipynb --sync

This will make sure jupyter keeps the two files in sync when saving from now on...

Upvotes: 0

Aaron Ciuffo

Reputation: 903

EDIT - 27 November 2023

The Jupytext plugin for Jupyter (notebook and lab) is an excellent solution for converting notebooks to pure python. Jupytext builds and maintains a near live connection between .ipynb and .py files. Editing one results in updates made to the other.

This makes it trivial to pull changes in from collaborators that work in either the notebook or flat text files. The cleanest solution is to ask Jupytext to use the "light" format when converting Notebooks to python. The default is to use the "percent" format which surrounds the plain-text version of notebook code from each cell with # %%. This is a bit ugly to read if you're looking for a "clean" output.

This solution should also exclude Markdown from the .py files.

To default to the light format, create a jupytext.toml file in the root of your project and add the following line:

jupytext.toml:

formats = "ipynb,py:light"

To include a shebang line on the first line of the .py file, add the following line somewhere in your notebook (e.g. last cell). Run this once -- you may need to delete the .py to handle the out-of-sync changes this causes.

!jupytext YourNotebookNameHere.ipynb --update-metadata '{"jupytext":{"executable":"/usr/bin/env python"}}' --to py

Also a good solution, but dated

I've been struggling to get the nbconvert module to install reliably under Python 3.11 on all platforms. Because of this, I've moved to Jupytext (see above).

Jupyter nbconvert has made this a little bit easier with a new template structure.

Templates should be placed in the template path. This can be found by running jupyter --paths

Each template should be placed in its own directory within the template directory and must contain a conf.json and index.py.j2 file.

This solution covers all the details for adding a template.

This template will remove all of the markdown, magics and cell numbers, leaving a "runnable" .py file. Run this template from within a notebook with !jupyter nbconvert --to python --template my_clean_python_template my_notebook.ipynb

index.py.j2

{%- extends 'null.j2' -%}

## set to python3
{%- block header -%}
#!/usr/bin/env python3
# coding: utf-8
{% endblock header %}

## remove cell counts entirely
{% block in_prompt %}
{% if resources.global_content_filter.include_input_prompt -%}
{% endif %}
{% endblock in_prompt %}

## remove markdown cells entirely
{% block markdowncell %}
{% endblock markdowncell %}

{% block input %}
{{ cell.source | ipython2python }}
{% endblock input %}


## remove magic statement completely
{% block codecell %}
{{'' if "get_ipython" in super() else super() }}
{% endblock codecell%}

Upvotes: 2

Andrey Kuehlkamp

Reputation: 420

Hopefully this will spare people from wasting 2 hours trying to use nbconvert template structure. This is what did the job for me:

jupyter nbconvert --to python --RegexRemovePreprocessor.patterns="^%" analysis.ipynb
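For anyone wondering what the "^%" pattern actually does: as far as I can tell from the nbconvert docs, RegexRemovePreprocessor drops whole cells whose source matches a pattern at the start, not individual lines. Here is a minimal sketch of that behaviour on a plain notebook dict. The function name drop_matching_cells and the sample cells are made up for illustration; this is not nbconvert's own code.

```python
import re


def drop_matching_cells(notebook, patterns):
    """Return a shallow copy of a notebook dict without the cells whose
    source matches any pattern at its start (hypothetical helper)."""
    # Wrap each pattern in a non-capturing group and OR them together.
    combined = re.compile("|".join(f"(?:{p})" for p in patterns))
    kept = []
    for cell in notebook["cells"]:
        source = cell["source"]
        # In .ipynb files the source may be stored as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        # re.match only tests the beginning of the cell source.
        if not combined.match(source):
            kept.append(cell)
    return {**notebook, "cells": kept}


# Made-up sample data, not taken from a real notebook.
example_notebook = {
    "cells": [
        {"cell_type": "code", "source": "%matplotlib inline\nimport numpy as np"},
        {"cell_type": "code", "source": "x = 1\n%timeit x + 1"},
        {"cell_type": "code", "source": "print('hello')"},
    ]
}

filtered = drop_matching_cells(example_notebook, ["^%"])
remaining_sources = [cell["source"] for cell in filtered["cells"]]
```

If that reading is right, the first sample cell disappears together with its import, while the %timeit in the second cell survives because it is not on the first line. So it pays to keep magics in their own cells.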

Upvotes: 3

krvkir

Reputation: 821

The most obvious solution seems to work for me:

 jupyter nbconvert --to python a_notebook.ipynb --stdout | grep -v -e "^get_ipython" | python

Of course, you can't use something like dirs = !ls in your notebooks for this to work.
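To illustrate that limitation, here is a rough Python equivalent of the grep filter. The helper name drop_get_ipython_lines and the sample lines are assumptions; in particular, I'm assuming nbconvert turns dirs = !ls into an assignment from get_ipython().getoutput(...).

```python
def drop_get_ipython_lines(script_text):
    """Drop every line that starts with get_ipython, like
    grep -v -e '^get_ipython' (hypothetical helper)."""
    kept = [
        line
        for line in script_text.splitlines()
        if not line.startswith("get_ipython")
    ]
    return "\n".join(kept)


# Assumed shape of the converted script, for illustration only.
converted = "\n".join(
    [
        "get_ipython().magic('matplotlib inline')",
        "import os",
        "dirs = get_ipython().getoutput('ls')",
        "print(len(dirs))",
    ]
)

cleaned = drop_get_ipython_lines(converted)
```

The standalone magic line is removed, but the assignment stays because the line starts with dirs, and it then fails with a NameError under plain Python.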

Upvotes: 3

Doug Hudgeon

Reputation: 1529

One way to get control of what appears in the output is to tag the cells that you don't want in the output and then use the TagRemovePreprocessor to remove the cells.

The code below also uses the exclude_markdown option of the TemplateExporter to remove markdown.

!jupyter nbconvert \
    --TagRemovePreprocessor.enabled=True \
    --TagRemovePreprocessor.remove_cell_tags="['parameters']" \
    --TemplateExporter.exclude_markdown=True \
    --to python "notebook_with_parameters_removed.ipynb"
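If it helps to see what those two options amount to, here is a minimal sketch on a plain notebook dict. It assumes tags live under each cell's metadata "tags" list, as in the v4 notebook format. The function name drop_tagged_and_markdown and the sample cells are hypothetical, and this is not how nbconvert implements it internally.

```python
def drop_tagged_and_markdown(notebook, remove_tags):
    """Return a shallow copy of a notebook dict without markdown cells and
    without cells carrying any of the given tags (hypothetical helper)."""
    remove_tags = set(remove_tags)
    kept = []
    for cell in notebook["cells"]:
        if cell["cell_type"] == "markdown":
            continue  # roughly what exclude_markdown=True achieves
        tags = set(cell.get("metadata", {}).get("tags", []))
        if tags & remove_tags:
            continue  # roughly what remove_cell_tags achieves
        kept.append(cell)
    return {**notebook, "cells": kept}


# Made-up sample data, not taken from a real notebook.
example_notebook = {
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["parameters"]}, "source": "alpha = 0.1"},
        {"cell_type": "markdown", "metadata": {}, "source": "# Analysis"},
        {"cell_type": "code", "metadata": {}, "source": "print('kept')"},
    ]
}

filtered = drop_tagged_and_markdown(example_notebook, ["parameters"])
kept_sources = [cell["source"] for cell in filtered["cells"]]
```

Only the untagged code cell is left, which is the effect the command above aims for.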

To remove the commented lines and the input prompt markers (like # In[1]:), I believe you'll need to post-process the Python file with something like the following, placed in the cell after the one you call !jupyter nbconvert from (note that this is Python 3 code):

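import re
from pathlib import Path

filename = Path.cwd() / 'notebook_with_parameters_removed.py'
code_text = filename.read_text().split('\n')
# Keep blank lines and any line that neither starts with '#' nor calls get_ipython()
lines = [line for line in code_text if len(line) == 0 or
        (line[0] != '#' and 'get_ipython()' not in line)]
clean_code = '\n'.join(lines)
# Collapse runs of blank lines into a single blank line
clean_code = re.sub(r'\n{2,}', '\n\n', clean_code)
# Overwrite the converted script with the cleaned version
filename.write_text(clean_code.strip())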

Upvotes: 10
