Reputation: 654
I am creating a Dockerfile that builds a Haskell program. It uses Ubuntu focal as the base image, installs ghcup, and then builds the program. I am doing this for several reasons: it supports a low-configuration CI environment, and it helps new developers who are trying to build a complicated project.
To speed up build times, I am using Docker v20 with BuildKit. The relevant part of the Dockerfile looks like this (the full file is quite long, so this is only an excerpt):
# installs haskell
WORKDIR $HOME
RUN git clone https://github.com/haskell/ghcup-hs.git
WORKDIR ghcup-hs
RUN BOOTSTRAP_HASKELL_NONINTERACTIVE=NO ./bootstrap-haskell
#RUN source ~/.ghcup/env # Uh-oh: can't do this.
# We recreate the contents of ~/.ghcup/env
ENV PATH=$HOME/.cabal/bin:$HOME/.ghcup/bin:$PATH
# builds application
COPY application $HOME/application
WORKDIR $HOME/application
RUN mkdir -p logs
RUN --mount=type=cache,target=$HOME/.cabal \
--mount=type=cache,target=$HOME/.ghcup \
--mount=type=cache,target=$HOME/application/dist-newstyle \
cabal build |& tee logs/configure.log
But when I change some non-code files in application (README.md, for example) and build my Docker image ...
DOCKER_BUILDKIT=1 docker build -t application/application:1.0 .
... it takes quite a bit of time, and the output from cabal build includes a lot of Downloading [blah] followed by Building/Installing/Completed messages from cabal install.
However, when I go into my container and type cabal build, it is much faster (it is already built):
host$ docker run -it application/application:1.0
container$ cabal build # this is fast
I would expect it to be just as fast in the prior case as well, since I have not really changed the code files, the dependencies are all downloaded, and I am using RUN --mount.
Are there files somewhere that my --mount=type=cache entries are not covering? Is there a package registry file somewhere that I need to include in its own --mount=type=cache line? As far as I can tell, my builds ought to be nearly instant instead of taking several minutes to complete.
Upvotes: 4
Views: 442
Reputation: 772
A few years later, but I think I have the answer.
One issue with OP's approach is that $HOME in that context is not going to be replaced with anything, so the target directory for the cache includes a folder literally called $HOME.
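As a rough analogy in plain shell (this is not what BuildKit runs internally, only an illustration of the same failure mode), an unexpanded $HOME is just an ordinary directory name. The temporary directory and the variable names here are made up for the example:

```shell
#!/bin/sh
# Analogy only: plain shell, not BuildKit itself.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Single quotes prevent expansion, so the directory is named "$HOME" verbatim.
mkdir -p '$HOME/.cabal'

# Double quotes expand the variable to the real home directory path.
expanded="$HOME/.cabal"

# The first line printed is the literal '$HOME' directory created above,
# the second is the path that was probably intended.
ls -d ./*/
printf '%s\n' "$expanded"
```

A cache mounted at the first path is never the directory cabal actually writes to, which would be consistent with the dependencies being downloaded again on every build.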
The next step was finding the right folders. You can do this by inspecting an image generated with the "wrong" configuration (i.e. without any caching). With a tool like dive, you can see that the changes in that layer include:
drwx------ 0:0 14 MB └── root
drwxr-xr-x 0:0 461 kB ├── .cache
drwxr-xr-x 0:0 461 kB │ └── cabal
drwxr-xr-x 0:0 11 kB │ ├── logs
-rw-r--r-- 0:0 485 B │ │ ├── build.log
drwxr-xr-x 0:0 11 kB │ │ └── ghc-9.8.4
-rw-r--r-- 0:0 11 kB │ │ └── text-2.1.2-030cac37f8d77dcf6263d580f
drwxr-xr-x 0:0 450 kB │ └── packages
drwxr-xr-x 0:0 450 kB │ └── hackage.haskell.org
drwxr-xr-x 0:0 450 kB │ └── text
drwxr-xr-x 0:0 450 kB │ └── 2.1.2
-rw-r--r-- 0:0 450 kB │ └── text-2.1.2.tar.gz
drwxr-xr-x 0:0 13 MB └── .local
drwxr-xr-x 0:0 13 MB └── state
drwxr-xr-x 0:0 13 MB └── cabal
drwxr-xr-x 0:0 13 MB └── store
drwxr-xr-x 0:0 13 MB └── ghc-9.8.4-1b19
drwxr-xr-x 0:0 0 B ├── incoming
-rw-r--r-- 0:0 0 B │ └── text-2.1.2-030cac37f8d77dcf6
drwxr-xr-x 0:0 14 kB ├── package.db
-rw-r--r-- 0:0 8.6 kB │ ├── package.cache
-rw-r--r-- 0:0 0 B │ ├── package.cache.lock
-rw-r--r-- 0:0 5.1 kB │ └── text-2.1.2-030cac37f8d77dcf6
drwxr-xr-x 0:0 13 MB └── text-2.1.2-030cac37f8d77dcf6263d
-rw-r--r-- 0:0 497 B ├── cabal-hash.txt
drwxr-xr-x 0:0 13 MB ├── lib
drwxr-xr-x 0:0 2.5 MB │ ├── Data
drwxr-xr-x 0:0 2.1 MB │ │ ├── Text
-rw-r--r-- 0:0 12 kB │ │ │ ├── Array.dyn_hi
-rw-r--r-- 0:0 12 kB │ │ │ ├── Array.hi
(more changes omitted)
With this we can see that the files were actually being cached in /root/.cache/cabal and /root/.local/state/cabal/store. Or at least, that's the case for this configuration of GHC and cabal-install; for this example I'm using haskell:9.8.4 as the base image.
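If you don't have dive at hand, a small shell helper run inside the container after an uncached build can give a similar hint. This is only a sketch: list_cabal_dirs is a hypothetical name, and the candidate list is an assumption covering the legacy ~/.cabal layout plus the XDG-style directories seen in the listing above, so it may miss locations used by other setups.

```shell
#!/bin/sh
# list_cabal_dirs is a hypothetical helper, not part of cabal or Docker.
# It prints which commonly used cabal directories exist under a home directory.
list_cabal_dirs() {
    home_dir=${1:-$HOME}
    # Candidates (an assumption): the legacy ~/.cabal layout first,
    # then the XDG-style cache, config, and store directories.
    for rel in .cabal .cache/cabal .config/cabal .local/state/cabal/store; do
        if [ -d "$home_dir/$rel" ]; then
            printf '%s\n' "$home_dir/$rel"
        fi
    done
}
```

Inside the container you would call it as list_cabal_dirs /root (or with no argument to use $HOME). Any directory it prints is a candidate for its own --mount=type=cache target.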
With all this in mind, here is a full, multi-stage Dockerfile that leverages the build cache (the executable is called example):
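# Build stage: compile the executable with GHC and cabal-install.
FROM haskell:9.8.4 AS builder
WORKDIR /app
COPY . .
# Cache the cabal store, the download cache, and the project's build directory across builds.
RUN --mount=type=cache,target=/root/.local/state/cabal/store \
--mount=type=cache,target=/root/.cache/cabal \
--mount=type=cache,target=./dist-newstyle \
cabal update && \
mkdir bin && \
cabal install --install-method=copy --installdir=./bin
# Runtime stage: ship only the compiled executable.
FROM debian:12-slim
COPY --from=builder /app/bin/example /usr/local/bin/
CMD ["example"]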
Upvotes: 0