kirikoumath

Reputation: 733

PGI OpenACC: target a specific GPU device

I have two NVIDIA cards:

$ ls /dev/nv*
/dev/nvidia0  /dev/nvidia1  /dev/nvidiactl  /dev/nvidia-uvm

Using pgcc, how do I target a specific card? How do I make sure the code is generated for nvidia0 (device=0) or nvidia1 (device=1)?

Thank you in advance for your help.

Upvotes: 1

Views: 1875

Answers (2)

Mat Colgrove

Reputation: 5646

There's also the OpenACC environment variable ACC_DEVICE_NUM, which sets the default device number to use.

Upvotes: 3

Robert Crovella

Reputation: 152143

The OpenACC API routine to target a particular device is:

acc_set_device_num( i, acc_device_nvidia );

If you call this once, at the beginning of your program, with i set to the device ordinal you wish to use, then you can target that device programmatically.

However, depending on your use case, you may find it easier just to write your code without such API routines, but instead use the CUDA_VISIBLE_DEVICES environment variable. For example, you could do:

CUDA_VISIBLE_DEVICES="0" ./my_app

to run your code on device 0, or

CUDA_VISIBLE_DEVICES="1" ./my_app

to run the same code on device 1.

To make sure the code is generated for a specific device type, you would append the compute capability of that device to the -ta switch during compilation, for example:

pgcc -ta=tesla:cc30 ...

would generate code for a cc3.0 device. If you use the command line help for pgcc:

pgcc -help

it will list the other supported options for this. For example my pgcc (15.7) shows:

...
-ta=tesla:{cc20|cc30|cc35|cc50|cuda6.5|cuda7.0|fastmath|[no]flushz|[no]fma|keepbin|keepgpu|keepptx|[no]lineinfo|[no]llvm|loadcache:{L1|L2}|maxregcount:<n>|pin|[no]rdc|[no]unroll|beta}|nvidia|radeon:{keep|[no]llvm|[no]unroll|tahiti|capeverde|spectre|buffercount:<n>}|host
                    Choose target accelerator
    tesla           Select NVIDIA Tesla accelerator target
     cc20           Compile for compute capability 2.0
     cc30           Compile for compute capability 3.0
     cc35           Compile for compute capability 3.5
     cc50           Compile for compute capability 5.0
...

Upvotes: 3
