Reputation: 23054

How to mount host volumes into docker containers in Dockerfile during build

Since this question was asked in 2014, a lot has happened and many things have changed. I'm revisiting the topic today and editing this question for the 12th time to reflect the latest changes. The question may seem long, but it is arranged in reverse chronological order, so the latest changes are at the top; feel free to stop reading at any point.

The question I wanted to solve was -- how to mount host volumes into docker containers in Dockerfile during build, i.e., having the docker run -v /export:/export capability during docker build.

One reason behind it, for me, is that when building things in Docker, I don't want the (apt-get install) caches locked inside a single container; I want to share and reuse them.

That was the main reason I asked this question. One more reason I'm facing today is that I want to use a huge private repo from the host; otherwise I have to git clone it inside docker using my private ssh key, which I don't know how to do and haven't looked into yet.

Latest Update:

The Buildkit in @BMitch's answer

With that RUN --mount syntax, you can also bind mount read-only directories from the build-context...

BuildKit is now built into docker (I had thought it was a third-party tool), as long as your version is 18.09 or newer. Mine is 20.10.7 now -- https://docs.docker.com/develop/develop-images/build_enhancements/

To enable BuildKit builds

Easiest way from a fresh install of docker is to set the DOCKER_BUILDKIT=1 environment variable when invoking the docker build command, such as:

$ DOCKER_BUILDKIT=1 docker build .

Else, you'll get:

the --mount option requires BuildKit. Refer to https://docs.docker.com/go/buildkit/ to learn how to build images with BuildKit enabled

So it'll be the perfect solution to my second use-case as explained above.

Update as of May 7, 2019:

Before docker v18.09, the correct answer should be the one that starts with:

There is a way to mount a volume during a build, but it doesn't involve Dockerfiles.

However, that was a poorly stated, organized and supported answer. When I was reinstalling my docker containers, I happened to stumble upon the following article:

Dockerize an apt-cacher-ng service
https://docs.docker.com/engine/examples/apt-cacher-ng/

That's docker's solution to this/my question, not directly but indirectly. It's the orthodox way docker suggests we do it, and I admit it is better than what I was trying to ask for here.

Another way is the newly accepted answer, i.e., the Buildkit in v18.09.

Pick whichever suits you.

Was: There had been a solution -- rocker, which was not from Docker, but now that rocker is discontinued, I have reverted the answer back to "Not possible" again.

Old Update: So the answer is "Not possible". I can accept it as an answer as I know the issue has been extensively discussed at https://github.com/docker/docker/issues/3156. I can understand that portability is a paramount issue for the docker developers; but as a docker user, I have to say I'm very disappointed about this missing feature. Let me close my argument with a quote from the aforementioned discussion: "I would like to use Gentoo as a base image but definitely don't want > 1GB of Portage tree data to be in any of the layers once the image has been built. You could have some nice a compact containers if it wasn't for the gigantic portage tree having to appear in the image during the install." Yes, I can use wget or curl to download whatever I need, but the fact that a mere portability consideration is now forcing me to download > 1GB of Portage tree each time I build a Gentoo base image is neither efficient nor user friendly. Furthermore, the package repository WILL ALWAYS be under /usr/portage, thus ALWAYS PORTABLE under Gentoo. Again, I respect the decision, but please allow me to express my disappointment as well in the meantime. Thanks.

Original question in details:

From

Share Directories via Volumes
http://docker.readthedocs.org/en/v0.7.3/use/working_with_volumes/

it says that the data volumes feature "have been available since version 1 of the Docker Remote API". My docker is version 1.2.0, but I found that the example given in the above article does not work:

# BUILD-USING:        docker build -t data .
# RUN-USING:          docker run -name DATA data
FROM          busybox
VOLUME        ["/var/volume1", "/var/volume2"]
CMD           ["/usr/bin/true"]

What's the proper way in Dockerfile to mount host-mounted volumes into docker containers, via the VOLUME command?

$ apt-cache policy lxc-docker
lxc-docker:
  Installed: 1.2.0
  Candidate: 1.2.0
  Version table:
 *** 1.2.0 0
        500 https://get.docker.io/ubuntu/ docker/main amd64 Packages
        100 /var/lib/dpkg/status

$ cat Dockerfile 
FROM          debian:sid

VOLUME        ["/export"]
RUN ls -l /export
CMD ls -l /export

$ docker build -t data .
Sending build context to Docker daemon  2.56 kB
Sending build context to Docker daemon 
Step 0 : FROM          debian:sid
 ---> 77e97a48ce6a
Step 1 : VOLUME        ["/export"]
 ---> Using cache
 ---> 59b69b65a074
Step 2 : RUN ls -l /export
 ---> Running in df43c78d74be
total 0
 ---> 9d29a6eb263f
Removing intermediate container df43c78d74be
Step 3 : CMD ls -l /export
 ---> Running in 8e4916d3e390
 ---> d6e7e1c52551
Removing intermediate container 8e4916d3e390
Successfully built d6e7e1c52551

$ docker run data
total 0

$ ls -l /export | wc 
     20     162    1131

$ docker -v
Docker version 1.2.0, build fa7b24f

Upvotes: 375

Answers (15)

James H

Reputation: 629

Dockerfile now has support for RUN --mount, but it only gives read-only access to the host machine.

To write files from a container created by docker build to the host machine, I ended up using the -o flag to export the image and then extracting the files I need from the .tar on the host machine:

docker build -o type=tar,dest=target/java_build_image.tar .

tar -xvf target/java_build_image.tar some-dir/file-I-need-from-the-image.txt

You can also use -o target/java_build_image/ to export without tar, but this is slower for my use case.

Can be a good alternative to the other approach of docker build -> docker run -> docker cp -> docker rm
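
As a hedged sketch of the same idea, the export can be narrowed to just the wanted files by adding a final scratch stage and exporting that stage with a local output, which avoids the tar step. The stage names, the /out path, result.txt and the ./target directory below are illustrative assumptions, not part of the setup described above.

```shell
# Write a two-stage Dockerfile into a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/Dockerfile" <<'EOF'
FROM alpine AS build
# Stand-in for the real build step (assumption): produce one output file.
RUN mkdir /out && echo "built artifact" > /out/result.txt

# Only what is copied into this stage ends up in the exported output.
FROM scratch AS artifacts
COPY --from=build /out/ /
EOF

# Export the filesystem of the "artifacts" stage to ./target on the host.
# Requires BuildKit; nothing is run until this function is called.
export_artifacts() {
    DOCKER_BUILDKIT=1 docker build \
        --target artifacts \
        --output type=local,dest=./target \
        "$workdir"
}
```

Calling export_artifacts on a machine with BuildKit enabled should leave only result.txt under ./target.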

Upvotes: 2

Daniel Santos

Reputation: 3295

TL;DR: Use docker run with bind-mount --volume options, install packages, then run docker commit from within the container.

Well I am ideologically opposed to Docker's philosophy. This limitation is the result of policy decisions that are not in line with allowing the user to do what they want with the software. Further, it offers poor consolation with --mount=type=cache, as I'm finding that the cache is cleared unexpectedly, forcing me to spend another 20 minutes downloading files over a slow connection.

(Optional) create a /var/cache/apt/archives overlay

I would rather use my system's apt cache, but I don't want the container modifying it. So my solution is to set up an overlay for the container to write to. (Feel free to skip this step and replace it with an empty directory somewhere.)

d=/var/cache/apt-overlay
mkdir -p $d/{upper,work,merged}
mount -t overlay -o lowerdir=/var/cache/apt/archives,upperdir=$d/upper,workdir=$d/work apt-archives $d/merged

The Solution

Use docker run to bind mount /var/cache/apt/archives, the /usr/bin/docker CLI, and /var/run/docker.sock. Then you can install your packages and run docker commit from within the container when you're done.

#!/bin/bash

img_src="debian:stable-slim"
img_dst="myimg"

set -ex
# Look up the gid of the host's docker group; awk prints nothing if it is missing
docker_gid=$(awk -F: '$1 == "docker" {print $3}' /etc/group)
if [ -z "$docker_gid" ]; then
    echo "Can't get docker group id" >&2
    exit 1
fi

DOCKER_BUILDKIT=1 \
BUILDKIT_PROGRESS=plain \
docker run --rm -i \
    --volume /var/cache/apt-overlay/merged:/var/cache/apt/archives \
    --volume /var/run/docker.sock:/var/run/docker.sock \
    --volume /usr/bin/docker:/usr/bin/docker:ro \
    $img_src << EOF
set -ex
groupadd --gid ${docker_gid} docker

mv /etc/apt/apt.conf.d/docker-clean /root; \
echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
    > /etc/apt/apt.conf.d/10apt-keep-downloads

export DEBIAN_FRONTEND=noninteractive
apt update
apt -y install apt-utils
apt -y upgrade
# This probably isn't really needed, but just in case...
sync
docker commit \$HOSTNAME $img_dst
EOF

Clean up

To achieve a smaller image, the final stages of your build should undo the changes to /etc/apt/apt.conf.d, but do so when /var/cache/apt/archives is not mounted.

docker build --tag <dst_img> - << EOF
FROM <src_img>
RUN \
set -ex; \
mv /root/docker-clean /etc/apt/apt.conf.d/; \
rm /etc/apt/apt.conf.d/10apt-keep-downloads; \
DEBIAN_FRONTEND=noninteractive apt clean
EOF

Upvotes: 2

John Hasty

Reputation: 1

On Linux to mount a USB drive inside the container for data storage do the following.

sudo fdisk -l

sudo mkdir /mnt/usb

sudo mount /dev/sda2 /mnt/usb

Before doing this you must unmount the drive from the OS file system. The device name (/dev/sda2 here) will be different on your machine. Then:

ls /mnt/usb

cd /mnt/usb && ls

sudo umount /mnt/usb

sudo mount /dev/sda2 /mnt/usb

docker run -i -t -v /mnt/usb:/opt/usb <image> /bin/bash

This puts the USB drive into the container at /opt/usb.

Upvotes: -2

jrbe228

Reputation: 578

Since BuildKit is still not supported for Windows containers, one alternative is using a network share. This creates a minimal Docker image size because installers are added/removed from the container during a single RUN statement.

Place your EXE / MSI files in a folder on the host machine
Share the folder
Start Docker build
Map the share from within the Docker container
Copy + run + delete each installer
Remove mapping within container
End Docker build
Un-share folder on host machine

Here is a working example for creating a Java 11 Windows container:

build.cmd:

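set "host=%COMPUTERNAME%"
rem Capture the output of whoami (DOMAIN\user) instead of the literal word
for /f "delims=" %%i in ('whoami') do set "domainAndUser=%%i"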
set "pswd=<host-password>"
set "shareName=tmpSmbShare"
set netPath=\\%COMPUTERNAME%\%shareName%
set "localPath=C:/Users/tester/Desktop/docker/Java11/winsrvcore/installers"

powershell New-SmbShare -Name %shareName% -Path %localPath% -FullAccess "Everyone"
docker build ^
--build-arg domainAndUser=%domainAndUser% ^
--build-arg pswd=%pswd% ^
--build-arg netPath=%netPath% ^
-t <docker-username>/java-11.0.16-winsrvcore.ltsc2019:1.0 .
powershell Remove-SmbShare -Name %shareName% -Force

Dockerfile:

FROM mcr.microsoft.com/windows/servercore:ltsc2019-amd64

ARG domainAndUser
ARG pswd
ARG netPath

RUN powershell New-SMBMapping -LocalPath "R:" -RemotePath "$env:netPath" -UserName "$env:domainAndUser" -Password "$env:pswd" \
    && dir "R:/" \
    && copy "R:/jdk-11.0.16_windows-x64_bin.exe" "C:/" \
    && powershell Start-Process -filepath 'C:/jdk-11.0.16_windows-x64_bin.exe' -Wait -PassThru -ArgumentList "/s,/L,install64.log" \
    && powershell Remove-SMBMapping -LocalPath "R:" -Force \
    && del "C:/jdk-11.0.16_windows-x64_bin.exe" 

ENV JAVA_HOME "C:\Program Files\Java\jdk-11.0.16"

Upvotes: 3

Arjan

Reputation: 23559

git clone from a private repo within docker using my private ssh key

With BuildKit when creating Linux images, docker build also provides SSH agent forwarding, using the --ssh flag. As documented in Docker's Using SSH to access private data in builds, a Dockerfile can then use --mount=type=ssh in a RUN command, to delegate SSH authentication to the host's SSH agent:

# syntax=docker/dockerfile:1
FROM alpine
RUN apk add --no-cache openssh-client git

# Download public key of remote server at github.com 
RUN mkdir -p -m 0700 ~/.ssh && ssh-keyscan github.com >> ~/.ssh/known_hosts

# Enable verbose output (optional)
ENV GIT_SSH_COMMAND='ssh -Tvv'

# Clone using private keys known to SSH agent on the host
RUN --mount=type=ssh git clone git@github.com:myorg/myproject.git myproject

At build time, the above can then basically use the host's SSH keys by simply running:

docker build --ssh default .

On the host, this needs some configuration using ssh-add, as explained in the documentation linked above; if all is well then one of echo $SSH_AGENT_SOCK or echo $SSH_AUTH_SOCK should give you some output, and ssh-add -L should show the identities available to the above.
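
The host-side check described above can be wrapped in a small helper. This is only a sketch: the function name agent_ready and its messages are made up for illustration, and it assumes the OpenSSH tools, where ssh-add -L exits non-zero if the agent is unreachable or holds no identities.

```shell
# Returns 0 only if an SSH agent is reachable and holds at least one identity.
agent_ready() {
    # SSH_AUTH_SOCK is the socket path exported by OpenSSH's ssh-agent.
    if [ -z "${SSH_AUTH_SOCK:-}" ]; then
        echo "SSH_AUTH_SOCK is not set; start an agent first" >&2
        return 1
    fi
    # Fails when the agent cannot be contacted or has no keys loaded.
    if ! ssh-add -L >/dev/null 2>&1; then
        echo "agent has no usable identities; run ssh-add" >&2
        return 1
    fi
    return 0
}

# Possible usage: agent_ready && docker build --ssh default .
```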

Some possible errors when using verbose logging:

Host key verification failed: you did not add the remote host to the image's ~/.ssh/known_hosts
pubkey_prepare: ssh_get_authentication_socket: No such file or directory: you forgot to include the --ssh default argument in docker build
pubkey_prepare: ssh_get_authentication_socket: Permission denied: are you using some USER <username>? Then specify its user id using --mount=type=ssh,uid=<uid>, or open the socket to all users, using --mount=type=ssh,mode=0666

This also works with pip/Conda/Mamba to install Python dependencies directly from version control, like git+ssh://git@github.com/myorg/myproject.git@mybranch#egg=myproject:

RUN --mount=type=ssh mamba env create -n myenv --file conda_environment.yml

Upvotes: 5

orby

Reputation: 352

If you are looking for a way to "mount" files, like -v for docker run, you can now use the --secret flag for docker build

echo 'WARMACHINEROX' > mysecret.txt
docker build --secret id=mysecret,src=mysecret.txt .

And inside your Dockerfile you can now access this secret

# syntax = docker/dockerfile:1.0-experimental
FROM alpine

# shows secret from default secret location:
RUN --mount=type=secret,id=mysecret cat /run/secrets/mysecret

# shows secret from custom secret location:
RUN --mount=type=secret,id=mysecret,dst=/foobar cat /foobar

More in-depth information about --secret available on Docker Docs

Upvotes: 4

Yegor

Reputation: 3950

As many have already answered, mounting host volumes during the build is not possible. I would just like to add the docker-compose way, which I think is nice to have, mostly for development/testing usage.

Dockerfile

FROM node:10
WORKDIR /app
COPY . .
RUN npm ci
CMD sleep 999999999

docker-compose.yml

version: '3'
services:
  test-service:
    image: test/image
    build:
      context: .
      dockerfile: Dockerfile
    container_name: test
    volumes:
      - ./export:/app/export
      - ./build:/app/build

Then run your container with docker-compose up -d --build

Upvotes: 6

BMitch

Reputation: 264841

First, to answer "why doesn't VOLUME work?" When you define a VOLUME in the Dockerfile, you can only define the target, not the source of the volume. During the build, you will only get an anonymous volume from this. That anonymous volume will be mounted at every RUN command, prepopulated with the contents of the image, and then discarded at the end of the RUN command. Only changes to the container are saved, not changes to the volume.

Since this question was asked, a few features have been released that may help. The first is multi-stage builds, which allow you to build a disk-space-inefficient first stage and copy just the needed output to the final stage that you ship. The second is Buildkit, which dramatically changes how images are built and adds new capabilities to the build.

For a multi-stage build, you would have multiple FROM lines, each one starting the creation of a separate image. Only the last image is tagged by default, but you can copy files from previous stages. The standard use is to have a compiler environment to build a binary or other application artifact, and a runtime environment as the second stage that copies over that artifact. You could have:

FROM debian:sid as builder
COPY export /export
RUN compile command here >/result.bin

FROM debian:sid
COPY --from=builder /result.bin /result.bin
CMD ["/result.bin"]

That would result in a build that only contains the resulting binary, and not the full /export directory.

Buildkit is coming out of experimental in 18.09. It's a complete redesign of the build process, including the ability to change the frontend parser. One of those parser changes implements the RUN --mount option, which lets you mount a cache directory for your run commands. E.g. here's one that mounts some of the debian directories (with a reconfigure of the debian image, this could speed up reinstalls of packages):

# syntax = docker/dockerfile:experimental
FROM debian:latest
RUN --mount=target=/var/lib/apt/lists,type=cache \
    --mount=target=/var/cache/apt,type=cache \
    apt-get update \
 && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
      git

You would adjust the cache directory for whatever application cache you have, e.g. $HOME/.m2 for maven, or /root/.cache for golang.

TL;DR: Answer is here: With that RUN --mount syntax, you can also bind mount read-only directories from the build-context. The folder must exist in the build context, and it is not mapped back to the host or the build client:

# syntax = docker/dockerfile:experimental
FROM debian:latest
RUN --mount=target=/export,type=bind,source=export \
    process export directory here...

Note that because the directory is mounted from the context, it's also mounted read-only, and you cannot push changes back to the host or client. When you build, you'll want an 18.09 or newer install and enable buildkit with export DOCKER_BUILDKIT=1.

If you get an error that the mount flag isn't supported, that indicates that you either didn't enable buildkit with the above variable, or that you didn't enable the experimental syntax with the syntax line at the top of the Dockerfile before any other lines, including comments. Note that the variable to toggle buildkit will only work if your docker install has buildkit support built in, which requires version 18.09 or newer from Docker, both on the client and server.
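
To check the version requirement before building, something like the following could be used. It is a sketch under assumptions: the helper names version_ge and buildkit_supported are made up here, and the comparison relies on sort -V (version sort), which GNU coreutils provides.

```shell
# True if version $1 is greater than or equal to version $2.
version_ge() {
    # After a version sort, the smaller of the two comes first.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# True if both the docker client and the daemon report 18.09 or newer.
# Only queries docker when called.
buildkit_supported() {
    client=$(docker version --format '{{.Client.Version}}') || return 1
    server=$(docker version --format '{{.Server.Version}}') || return 1
    version_ge "$client" 18.09 && version_ge "$server" 18.09
}

# Possible usage: buildkit_supported && DOCKER_BUILDKIT=1 docker build .
```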

Upvotes: 96

Akom

Reputation: 1641

Here is a simplified version of the 2-step approach using build and commit, without shell scripts. It involves:

Building the image partially, without volumes
Running a container with volumes, making changes, then committing the result, replacing the original image name.

With relatively minor changes the additional step adds only a few seconds to the build time.

Basically:

docker build -t image-name . # your normal docker build

# Now run a command in a throwaway container that uses volumes and makes changes:
docker run -v /some:/volume --name temp-container image-name /some/post-configure/command

# Replace the original image with the result:
# (reverting CMD to whatever it was, otherwise it will be set to /some/post-configure/command)   
docker commit --change="CMD bash" temp-container image-name 

# Delete the temporary container:
docker rm temp-container

In my use case I want to pre-generate a maven toolchains.xml file, but my many JDK installations are on a volume that isn't available until runtime. Some of my images are not compatible with all the JDKs, so I need to test compatibility at build time and populate toolchains.xml conditionally. Note that I don't need the image to be portable; I'm not publishing it to Docker Hub.

Upvotes: 4

Keith Mason

Reputation: 239

There is a way to mount a volume during a build, but it doesn't involve Dockerfiles.

The technique would be to create a container from whatever base you wanted to use (mounting your volume(s) in the container with the -v option), run a shell script to do your image building work, then commit the container as an image when done.

Not only will this leave out the excess files you don't want (this is good for secure files as well, like SSH files), it also creates a single image. It has downsides: the commit command doesn't support all of the Dockerfile instructions, and it doesn't let you pick up where you left off if you need to edit your build script.

UPDATE:

For example,

CONTAINER_ID=$(docker run -dit ubuntu:16.04)
docker cp build.sh $CONTAINER_ID:/build.sh
docker exec -t $CONTAINER_ID /bin/sh -c '/bin/sh /build.sh'
docker commit $CONTAINER_ID $REPO:$TAG
docker stop $CONTAINER_ID
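
The example above does not show the -v mount that the technique relies on. A minimal variant with a host directory mounted read-only might look like the following; the function name build_with_volume, the /export mount point and the two arguments are illustrative assumptions, and build.sh is expected in the current directory.

```shell
# Usage: build_with_volume <host-dir> <repo:tag>
# Runs build.sh in a container that has <host-dir> mounted at /export,
# then commits the result. Mounted data is not part of the committed image.
build_with_volume() {
    host_dir=$1
    image=$2
    cid=$(docker run -dit -v "$host_dir:/export:ro" ubuntu:16.04) || return 1
    if docker cp build.sh "$cid:/build.sh" &&
        docker exec -t "$cid" /bin/sh /build.sh &&
        docker commit "$cid" "$image"
    then
        status=0
    else
        status=1
    fi
    # Remove the temporary container whether or not the build succeeded.
    docker rm -f "$cid" >/dev/null
    return "$status"
}
```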

Upvotes: 22

xpt

Reputation: 23054

UPDATE: Somebody just won't take no as the answer, and I like it, very much, especially to this particular question.

GOOD NEWS, There is a way now --

The solution is Rocker: https://github.com/grammarly/rocker

John Yani said, "IMO, it solves all the weak points of Dockerfile, making it suitable for development."

Rocker

https://github.com/grammarly/rocker

By introducing new commands, Rocker aims to solve the following use cases, which are painful with plain Docker:

Mount reusable volumes on build stage, so dependency management tools may use cache between builds.

Share ssh keys with build (for pulling private repos, etc.), while not leaving them in the resulting image.

Build and run application in different images, be able to easily pass an artifact from one image to another, ideally have this logic in a single Dockerfile.

Tag/Push images right from Dockerfiles.

Pass variables from shell build command so they can be substituted to a Dockerfile.

And more. These are the most critical issues that were blocking our adoption of Docker at Grammarly.

Update: Rocker has been discontinued, per the official project repo on Github

As of early 2018, the container ecosystem is much more mature than it was three years ago when this project was initiated. Now, some of the critical and outstanding features of rocker can be easily covered by docker build or other well-supported tools, though some features do remain unique to rocker. See https://github.com/grammarly/rocker/issues/199 for more details.

Upvotes: 68

MatrixManAtYrService

Reputation: 9201

It's ugly, but I achieved a semblance of this like so:

Dockerfile:

FROM foo
COPY ./m2/ /root/.m2
RUN stuff

imageBuild.sh:

docker build . -t barImage
container="$(docker run -d barImage)"
rm -rf ./m2
docker cp "$container:/root/.m2" ./m2
docker rm -f "$container"

I have a java build that downloads the universe into /root/.m2, and did so every single time. imageBuild.sh copies the contents of that folder onto the host after the build, and Dockerfile copies them back into the image for the next build.

This is something like how a volume would work (i.e. it persists between builds).

Upvotes: 9

nealmcb

Reputation: 13481

I think you can do what you want to do by running the build via a docker command which itself is run inside a docker container. See Docker can now run within Docker | Docker Blog. A technique like this, but which actually accessed the outer docker from within a container, was used, e.g., while exploring how to Create the smallest possible Docker container | Xebia Blog.

Another relevant article is Optimizing Docker Images | CenturyLink Labs, which explains that if you do end up downloading stuff during a build, you can avoid having space wasted by it in the final image by downloading, building and deleting the download all in one RUN step.

Upvotes: 5

Andreas Steffan

Reputation: 6159

It is not possible to use the VOLUME instruction to tell docker what to mount. That would seriously break portability. This instruction tells docker that content in those directories does not go in images and can be accessed from other containers using the --volumes-from command line parameter. You have to run the container using -v /path/on/host:/path/in/container to access directories from the host.

Mounting host volumes during build is not possible. There is no privileged build and mounting the host would also seriously degrade portability. You might want to try using wget or curl to download whatever you need for the build and put it in place.

Upvotes: 137

Behe

Reputation: 7940

As you run the container, a directory on your host is created and mounted into the container. You can find out what directory this is with

$ docker inspect --format "{{ .Volumes }}" <ID>
map[/export:/var/lib/docker/vfs/dir/<VOLUME ID...>]

If you want to mount a directory from your host inside your container, you have to use the -v parameter and specify the directory. In your case this would be:

docker run -v /export:/export data

So you would use the host's folder inside your container.

Upvotes: 8