Reputation: 3769
I have a for loop and I want to parallelize it with OpenACC if the target hardware is NVIDIA, or run it serially when the target hardware is AMD. I tried the following:
#pragma acc loop \
device_type(tesla) parallel \
device_type(radeon) seq
for (int z = 0; z < size_z; ++z)
{
// do stuff...
}
Compiled with: pgc++ -std=c++11 -O4 -ta=tesla -Minfo:accel main.cpp
But in the parallelization report I get: <line_number>, #pragma acc loop seq
It appears that the compiler only takes the last line of the directive into account. Any idea why this is happening?
Running pgc++ --version gives the following:
pgc++ 16.10-0 64-bit target on x86-64 Linux -tp sandybridge
Upvotes: 0
Views: 328
Reputation: 5646
You're using "device_type" correctly, but we (PGI) are still missing a few OpenACC features, including defining multiple loop schedules via the "device_type" clause. The current limitations are listed in section 4.4 of the PGI release notes: http://www.pgroup.com/doc/pgirn-x64.pdf
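Until per-device loop schedules are supported, one possible workaround is to pick the schedule at compile time with the preprocessor instead of with "device_type". The sketch below is only an illustration and is not taken from the release notes. The macro USE_ACC_PARALLEL and the function scale are hypothetical names; you would define the macro yourself (for example with -DUSE_ACC_PARALLEL) in the build that targets NVIDIA and leave it out in the build where the loop should stay serial. It also assumes the loop body is safe to run in parallel.

```cpp
#include <vector>

// USE_ACC_PARALLEL is a hypothetical, user-defined macro (pass it with -D);
// it is not predefined by the compiler.
void scale(std::vector<float>& data, float factor)
{
    float* p = data.data();
    const int n = static_cast<int>(data.size());

#if defined(USE_ACC_PARALLEL)
    // Build with the macro defined: offload and parallelize the loop.
    #pragma acc parallel loop copy(p[0:n])
#endif
    // Build without the macro: no directive is emitted, so this is a plain serial loop.
    for (int z = 0; z < n; ++z)
    {
        p[z] *= factor;
    }
}
```

Note that without the macro the loop runs serially on the host rather than as a "seq" loop inside a device compute region, so this approximates the intent of the original directive rather than reproducing it.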
Upvotes: 1