user3731622

Reputation: 5095

How to use CUDA_VISIBLE_DEVICES to set --gpus argument of docker run cmd?

The docker run cmd docs show an example of how to specify several (but not all) gpus:

docker run -it --rm --gpus '"device=0,2"' ubuntu nvidia-smi

I'd like to set --gpus to the GPUs listed in the environment variable CUDA_VISIBLE_DEVICES.

I tried the obvious

docker run --rm -it --env CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES --gpus '"device=$CUDA_VISIBLE_DEVICES"' some_repo:some_tag /bin/bash

But this gives the error:

docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: device error: $CUDA_VISIBLE_DEVICES: unknown device: unknown.

Note: currently CUDA_VISIBLE_DEVICES=0,1

I saw a GitHub issue about this, but the solution is a bit messy and didn't work for me.

What is a good way to use CUDA_VISIBLE_DEVICES to set --gpus argument of docker run cmd?

Upvotes: 2

Views: 4723

Answers (1)

jkr

Reputation: 19250

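The single quotes in '"device=$CUDA_VISIBLE_DEVICES"' prevent the shell from expanding the variable, so Docker receives the literal string $CUDA_VISIBLE_DEVICES. Drop the single quotes, but keep the inner double quotes by escaping them: Docker parses the --gpus value as comma-separated fields, so without them a list such as device=0,1 is split at the comma.

docker run --rm -it \
  --env CUDA_VISIBLE_DEVICES="$CUDA_VISIBLE_DEVICES" \
  --gpus "\"device=$CUDA_VISIBLE_DEVICES\"" some_repo:some_tag /bin/bash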
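
To check what the shell actually passes to Docker, here is a minimal sketch. The helper name gpus_arg is hypothetical (it is not part of Docker). It follows the quoting of the docs example in the question, keeping literal double quotes around the device list, and it refuses to run when the variable is unset or empty rather than guessing a default.

```shell
#!/bin/sh
# Hypothetical helper (not a Docker feature): print the value for --gpus
# built from CUDA_VISIBLE_DEVICES, e.g. "device=0,1" with literal quotes.
gpus_arg() {
    if [ -z "${CUDA_VISIBLE_DEVICES:-}" ]; then
        echo "CUDA_VISIBLE_DEVICES is unset or empty" >&2
        return 1
    fi
    # The literal double quotes match the docs example and keep a
    # comma-separated device list together as a single value.
    printf '"device=%s"' "$CUDA_VISIBLE_DEVICES"
}

CUDA_VISIBLE_DEVICES=0,1

# Single quotes: the shell does not expand the variable.
printf '%s\n' '"device=$CUDA_VISIBLE_DEVICES"'   # prints "device=$CUDA_VISIBLE_DEVICES"

# Helper output: the variable is expanded and the inner quotes survive.
printf '%s\n' "$(gpus_arg)"                      # prints "device=0,1"
```

With the helper defined, the command from the question could be written as docker run --rm -it --gpus "$(gpus_arg)" some_repo:some_tag /bin/bash.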

Upvotes: 2
