Reputation: 919
I have a simple Docker image with the following Dockerfile:
FROM python:2.7-onbuild
RUN python -m nltk.downloader 'punkt'
Whenever this image is built, it downloads the package from nltk again. How can I cache it?
Upvotes: 1
Views: 364
Reputation: 568
The clue is the 'onbuild' in the FROM line: the base image is executing extra instructions in your build, which may include something like the following:
ONBUILD ADD . /app/src
ONBUILD RUN /usr/local/bin/python-build --dir /app/src
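causing whatever files are in your current directory, notably including any you've changed (e.g. your Dockerfile, unless you've excluded it with .dockerignore), to be pulled into your image and then built. Because ONBUILD triggers run immediately after the FROM line, any changed file invalidates the cache for every later step, including your nltk download.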
It's unfortunate that Docker can't print the cause of cache misses in a meaningful way, but hashes have no concept of 'nearby'.
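One way to act on this, as a minimal sketch rather than a definitive fix: drop the -onbuild variant and order the steps yourself, so the nltk download sits above the step that copies your source. The requirements.txt (assumed to list nltk) and the app.py entry point are hypothetical names, not taken from the question.

```dockerfile
# Sketch only: requirements.txt and app.py are assumed, hypothetical file names.
FROM python:2.7

WORKDIR /usr/src/app

# Dependencies change rarely, so install them before copying the source.
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# This layer is reused from cache as long as the layers above are unchanged.
RUN python -m nltk.downloader punkt

# Source files change often; copying them last only invalidates later steps.
COPY . .

CMD ["python", "./app.py"]
```

With this ordering, editing your source (or the Dockerfile below the download line) should no longer trigger a fresh download; only a change to requirements.txt or to an earlier instruction would.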
Upvotes: 0
Reputation: 4185
This is as expected. I see two options:
(… python:2.7-onbuild) that has NLTK and the data preloaded and use this for your image. Try something like this one perhaps.
Upvotes: 1