Maxime De Bruyn

Reputation: 919

docker cache not working

I have a simple Docker image with the following Dockerfile:

FROM python:2.7-onbuild

RUN python -m nltk.downloader 'punkt'

Whenever this image is built, it downloads the punkt package from NLTK again. How can I cache it?

Upvotes: 1

Views: 364

Answers (2)

Tim Baverstock

Reputation: 568

The clue is the 'onbuild' in the FROM line: the base image carries ONBUILD triggers, extra instructions that run as part of your build immediately after FROM. They may include something like the following:

ONBUILD ADD . /app/src
ONBUILD RUN /usr/local/bin/python-build --dir /app/src

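These instructions pull whatever files are in your current directory into your image and then build them. That notably includes any file you've changed, e.g. your Dockerfile, unless you've excluded it with .dockerignore. Because the triggers run before your own RUN line, any change to those files invalidates the cache for the copy step and for every step after it, including the NLTK download.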

It's unfortunate that Docker can't print the cause of cache misses in a meaningful way, but hashes have no concept of 'nearby'.
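One way to sidestep this is to drop the onbuild variant and spell the steps out, so the download sits above anything that copies your source. This is only a sketch: it assumes the onbuild image does roughly "copy requirements.txt, pip install, copy the rest" into /usr/src/app, and that installing nltk on its own before the rest of your requirements is acceptable for your project.

```dockerfile
# Plain base image: no ONBUILD triggers run before our own steps.
FROM python:2.7

# Assumed working directory, mirroring what the onbuild variant is believed to use.
WORKDIR /usr/src/app

# Install NLTK and fetch punkt before any project files are copied in.
# These layers are reused from cache until these lines (or the base image) change.
RUN pip install --no-cache-dir nltk
RUN python -m nltk.downloader punkt

# Dependencies next: this layer is rebuilt only when requirements.txt changes.
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# Source code last, since it changes most often.
COPY . .
```

With this ordering, editing your code or the tail of the Dockerfile should only rebuild the final COPY layer. If requirements.txt pins a different nltk version, pip will replace the earlier install at that step, but the downloaded data stays in its cached layer.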

Upvotes: 0

declension

Reputation: 4185

This is as expected. I see two options:

  • Mount a volume from your host with the cached NLTK data (wherever that sits)
  • Create a base image (instead of python:2.7-onbuild) that has NLTK and the data preloaded and use this for your image. Try something like this one perhaps.
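For the second option, a base image could look roughly like the sketch below. The file name Dockerfile.base and the tag my-nltk-base are hypothetical, and the data directory is an assumption chosen because /usr/local/share/nltk_data is one of the locations NLTK searches by default.

```dockerfile
# Dockerfile.base (hypothetical name): built once, then reused as a base.
FROM python:2.7

# Install NLTK and download punkt into a system-wide data directory.
# -d sets the download directory for the NLTK downloader.
RUN pip install --no-cache-dir nltk \
 && python -m nltk.downloader -d /usr/local/share/nltk_data punkt
```

After building and tagging it once (for example `docker build -t my-nltk-base -f Dockerfile.base .`), your own Dockerfile would start with `FROM my-nltk-base` and no longer needs the download step. Since this base has no ONBUILD triggers, your Dockerfile would also have to copy the application and install its requirements itself.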

Upvotes: 1
