Jon

Reputation: 1241

A single program appears on two GPU cards

I have multiple GPU cards (No. 0, No. 1, ...). Every time I run a Caffe process on any card other than No. 0 (e.g. No. 1 or No. 2), it also uses up 73 MiB on the No. 0 card.

For example, in the figure below, process 11899 uses 73 MiB on the No. 0 card even though it actually runs on the No. 1 card.

[Figure: nvidia-smi output showing process 11899 listed on both GPU 0 and GPU 1]

Why? Can I disable this feature?

Upvotes: 1

Views: 142

Answers (1)

Robert Crovella

Reputation: 152279

The CUDA driver is like an operating system. It will reserve memory for various purposes when it is active. Certain features, such as managed memory, may cause substantial side-effect allocations to occur (although I don't think this is the case with Caffe). And it's even possible that the application itself is doing some explicit allocations on those devices, for some reason.

If you want to prevent this, one option is to use the CUDA_VISIBLE_DEVICES environment variable when you launch your process.

For example, if you want to prevent CUDA from doing anything with card "0", you could do something like this (on Linux):

CUDA_VISIBLE_DEVICES="1,2" ./my_application ...
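If you want to confirm from inside a program what the variable actually exposes, a minimal sketch like the following (plain CUDA runtime API; the file name check_devices.cu is just for illustration) prints the number of visible devices:

#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    // With CUDA_VISIBLE_DEVICES="1,2" this should print 2: card "0" is not
    // visible to the process at all, so no context (and no ~73 MiB) gets
    // created on it.
    printf("visible CUDA devices: %d\n", count);
    return 0;
}

Compiled with nvcc and launched as CUDA_VISIBLE_DEVICES="1,2" ./check_devices, it should report 2 devices.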

Note that the enumeration used above (the CUDA enumeration) is the same enumeration that would be reported by the deviceQuery sample app, but not necessarily the same enumeration reported by nvidia-smi (the NVML enumeration). You may need to experiment or else run deviceQuery to determine which GPUs you want to use, and which you want to exclude.
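A small sketch along the lines of deviceQuery can help with that mapping: printing each CUDA device's PCI bus ID lets you match it against the Bus-Id column that nvidia-smi prints (the exact printout format here is just illustrative):

#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) return 1;
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) != cudaSuccess) continue;
        // Match these PCI IDs against the "Bus-Id" column of nvidia-smi to
        // map CUDA ordinals to the ordinals nvidia-smi reports.
        printf("CUDA device %d: %s (PCI bus %02x, device %02x)\n",
               i, prop.name, (unsigned)prop.pciBusID, (unsigned)prop.pciDeviceID);
    }
    return 0;
}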

Also note that using this option actually changes which devices are visible to an application, and will cause a re-ordering of device enumeration (the device that was previously "1" will appear to be enumerated as device "0", for example). So if your application is multi-GPU aware and selects specific devices for use, you may need to change which device ordinals you (or the application) select when you use this environment variable.
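If the application picks devices itself, one sketch of that kind of defensive selection looks like this (select_device is a hypothetical helper for illustration, not Caffe's API):

#include <cstdio>
#include <cuda_runtime.h>

// Select a device by the ordinal the application was configured with, after
// checking it against what is actually visible. Under CUDA_VISIBLE_DEVICES="1,2",
// ordinal 0 here refers to the card that the full enumeration calls "1".
bool select_device(int requested) {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || requested >= count) {
        fprintf(stderr, "device %d is not visible (%d device(s) visible)\n",
                requested, count);
        return false;
    }
    return cudaSetDevice(requested) == cudaSuccess;
}

int main() {
    // A configuration that used to say "use device 1" may need to say
    // "use device 0" once CUDA_VISIBLE_DEVICES hides card 0.
    return select_device(0) ? 0 : 1;
}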

Upvotes: 1
