Reputation: 435
My question is about the declare target construct of OpenMP and the dynamic/nested parallelism feature of GPUs. OpenACC 2.0 supports dynamic parallelism in two ways: through the routine directive, and by nesting parallel/kernels directives. In OpenMP, however, nesting device directives such as target, teams, or distribute is prohibited. Still, the declare target directive looks similar to OpenACC's routine directive to me.
However, its semantics and usage are still not clear to me. My sample code looks like this:
    #pragma omp declare target
    void foo(){
        //work-1
        #pragma omp target teams distribute parallel for
        for(...)
        {
            if(...){
                foo();   // recursive call from inside the target region
                {
                    //work-2
                }
            }
            //work-3
        }
    }
    #pragma omp end declare target

    int main(){
        //work
        foo();
    }
In this example, I expect the function foo to be mapped as a device function. But since it contains a target construct, the compiler will have to transform the code in some way. My question is: what happens when the threads encounter the call to foo inside the target construct?
Upvotes: 0
Views: 430
Reputation: 36
OpenMP currently does not support nesting of target regions, so your code will most likely not compile. If it does compile, it may crash at runtime when the nested target construct is encountered, or it may produce unexpected results. Basically, the behavior of a non-conforming program is unspecified.
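One conforming way to restructure code like this is to keep the target construct on the host side only, and to put in the declare target region just the work that device threads need to call. The following is a minimal sketch under that assumption; the names `work_item` and `sum_even_squares`, and the arithmetic they do, are hypothetical stand-ins for work-1/work-2/work-3 and are not taken from the question.

```c
/* Hypothetical device-callable helper. It is compiled for the device
   because of declare target, and it contains NO target construct. */
#pragma omp declare target
int work_item(int i)
{
    return i * i;   /* placeholder for the real per-iteration work */
}
#pragma omp end declare target

/* Host-side driver: the only place where a target construct appears.
   The scalar reduction variable is mapped tofrom explicitly so that the
   result is copied back to the host. */
long sum_even_squares(int n)
{
    long total = 0;
    #pragma omp target teams distribute parallel for \
            reduction(+:total) map(tofrom: total)
    for (int i = 0; i < n; ++i) {
        if (i % 2 == 0) {
            total += work_item(i);   /* plain device function call */
        }
    }
    return total;
}
```

Calling `sum_even_squares(5)` from `main` adds the squares of 0, 2, and 4. Because the helper has no target construct of its own, no target region is encountered inside another one. If the pragmas are ignored (for example, when compiling without OpenMP enabled), the same code simply runs sequentially on the host.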
Upvotes: 1