Kermit
Kermit

Reputation: 5992

Docker - how to ensure commit will persist a file?

I keep doing a pull, run, <UPLOAD FILE>, commit, tag, push cycle only to be dismayed that my file is gone when I pull the pushed container. My goal is to include an ipynb file with my image that serves as a README/ tutorial for my users.

Reading other posts, I see that commit is/ isn't the way to add a file. What causes commit to persist/ disregard a file? Am I supposed to use docker cp to add the file before commiting?

Upvotes: 0

Views: 109

Answers (2)

Kermit
Kermit

Reputation: 5992

For those of you looking for some place to start with docker build, below are the lines in the Dockerfile that I triggered with docker build -t your-image-name:your-new-t

Dockerfile

FROM jupyter/datascience-notebook:latest
MAINTAINER name <email>


# ====== PRE SUDO ======
ENV JUPYTER_ENABLE_LAB=yes

# If you run pip as sudo it continually prints errors.
# Tidyverse is already installed, and installing gorpyter installs the correct versions of other Python dependencies.
RUN pip install gorpyter
# commenting out our public repo
ENV R_HOME=/opt/conda/lib/R

# https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s09.html
# Looks like /usr/local/man is symlinking all R/W toward /usr/local/share/man instead
COPY python_sdk.ipynb /usr/local/share/man
COPY r_sdk.ipynb /usr/local/share/man
ENV NOTEBOOK_DIR=/usr/local/share/man
WORKDIR /usr/local/share/man


# ====== SUDO ======
USER root

# Spark requires Java 8.
RUN sudo apt-get update && sudo apt-get install openjdk-8-jdk -y
ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java

# If you COPY files into the same VOLUME that you mount in docker-compose.yml, then those files will disappear at runtime.
# `user_notebooks/` is the folder that gets mapped as a VOLUME to the user's local folder during runtime.
RUN mkdir /usr/local/share/man/user_notebooks

Upvotes: 0

Paul Becotte
Paul Becotte

Reputation: 9977

If you need to publish your notebook file in a docker image, use a Dockerfile, something like this-

FROM jupyter/datascience-notebook
COPY mynotebook.ipynb /home/jovyan/work

Then, once you have your notebook the way you want it, just run docker build, docker push. To try and help you a bit more, the reason you are having your problem is that the jupyter images store the notebooks in a volume. Data in a volume is not part of the image, it lives on the filesystem of the host machine. That means that a commit isn't going to save anything in the work folder.

Really, an ipynb is a data file, not an application. The right way to do this is probably to just upload the ipynb file to a file store somewhere and tell your users to download it, since they could use one docker image to run many data files. If you really want a prebuilt image using the workflow you described, you could just put the file somewhere else that isn't in a volume so that it gets captured in your commit.

Upvotes: 1

Related Questions