Correct use of device_type in OpenACC

Question

I have a for loop and I want to parallelize it with OpenACC if the target hardware is NVIDIA, or run it serially when the target hardware is AMD. I tried the following:

#pragma acc loop \
    device_type(tesla) parallel \
    device_type(radeon) seq
for (int z = 0; z < size_z; ++z)
{
    // do stuff...
}

Compiled with: pgc++ -std=c++11 -O4 -ta=tesla -Minfo:accel main.cpp

But in the parallelization report I get: #pragma acc loop seq

It appears that the compiler only takes the last line of the directive into account. Any idea why this is happening?

Running pgc++ --version gives the following:

pgc++ 16.10-0 64-bit target on x86-64 Linux -tp sandybridge

Mat Colgrove · Accepted Answer

You're using "device_type" correctly, but we (PGI) are still missing a few OpenACC features, including support for defining multiple loop schedules via the "device_type" clause. The current limitations are listed in section 4.4 of the PGI release notes: http://www.pgroup.com/doc/pgirn-x64.pdf
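
Until that is supported, one possible workaround is to pick the loop schedule at compile time with the preprocessor instead of "device_type". The following is only a minimal sketch under some assumptions: TARGET_NVIDIA is a hypothetical macro you would define yourself (for example by adding -DTARGET_NVIDIA to the NVIDIA build), and the scale function, its arrays, and the kernels region are illustrative stand-ins for the "do stuff" loop in the question.

```cpp
#include <vector>

// Illustrative stand-in for the loop in the question: multiplies every
// element of `in` by `factor`. Not taken from the original code.
std::vector<float> scale(const std::vector<float>& in, float factor)
{
    std::vector<float> out(in.size());
    const float* src = in.data();
    float* dst = out.data();
    const int size_z = static_cast<int>(in.size());

    // A compiler without OpenACC support ignores these pragmas and runs
    // the loop serially on the host.
#pragma acc kernels copyin(src[0:size_z]) copyout(dst[0:size_z])
    {
#if defined(TARGET_NVIDIA)
        // Hypothetical macro set for NVIDIA builds: assert that the
        // iterations are independent so the compiler may parallelize.
#pragma acc loop independent
#else
        // Every other target: force sequential execution of the loop.
#pragma acc loop seq
#endif
        for (int z = 0; z < size_z; ++z)
        {
            dst[z] = src[z] * factor;
        }
    }
    return out;
}
```

The trade-off is that the schedule is fixed per build rather than selected per device by a single directive, so each target needs its own compile with the matching -D setting.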
