Reputation: 143
I decided to try and use Google Cloud Datalab for a small project that I'm working on rather than a Jupyter Notebook in an Anaconda environment on an AWS instance.
How can I install a package (for example OpenCV) onto the Datalab VM so that I don't have to reinstall it every time I restart my VM? Why do the packages disappear after every restart but the updated notebooks remain persistent? Any help answering these questions and clarifying how the Datalab VM works would be very helpful.
Upvotes: 2
Views: 1040
Reputation: 5225
The notebooks are stored in a docker volume mount that represents a location on the persistent disk that is maintained across restarts of the VM.
The packages you install however are stored in the running container and hence lost on each restart.
You could create a custom docker image and use that instead. On the datalab create
command, see the --image-name
argument.
Here is an example of a Dockerfile you'll want to use:
FROM gcr.io/cloud-datalab/datalab:latest
RUN pip install opencv
Note that you'll need build the docker image using this docker file, and push the image to Google Container Registry. My memory is a bit fuzzy on this, but it is possible this image needs to be marked as public.
Hope that helps!
Upvotes: 2