
Reputation: 749

PGI compiler issue in routine directive

This is a section of C++ code developed for multi-GPU use. I am trying the CPU version, where OPENACC=0.

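    #if (OPENACC==1)
    #pragma acc routine
    #endif
    void myCass::method( int i, int j, int dir, int index )
    {
    #if (OPENACC==1)
        // Fixed-size automatic array, private to each thread
        double Sn[ZSIZE];
    #else
        // Dynamic allocation with new: this is the line that triggers the error
        double *Sn = new double[ZSIZE];
        // C-style alternative that compiles:
        // double *Sn = (double *)malloc(ZSIZE * sizeof(double));
    #endif
    }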

The method above gives the compiler error "PGCC-S-1000-Call in OpenACC region to procedure '_Znam' which has no acc routine information", but if I replace the "new" with C-style allocation (i.e. malloc), it compiles fine. Is this to be expected? I use PGI version 18.1.
Is it safe to use a large private variable such as Sn?

Upvotes: 0

Answers (2)

Mat Colgrove

Reputation: 5646

"_Znam" is the mangled name for the C++ operator new[], which doesn't have a device-side equivalent. However, there is a CUDA device malloc routine, which is why the malloc version works.

That said, I would highly recommend that you not perform device-side dynamic allocation. Besides being very slow, the device heap is very small (the default is around 8MB). While you can increase this by setting the PGI environment variable "PGI_ACC_CUDA_HEAPSIZE", the max is still only around 32MB. (Note that I haven't tested the max device heap size in a while, so this may have increased, but I'd still not recommend doing device-side allocation.)
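One way to follow this advice is to keep the scratch array as a fixed-size automatic variable, as the OPENACC==1 branch of the question already does, so that no call to operator new or malloc is generated inside the device routine. The following is only a minimal sketch: the value of ZSIZE and the names column_sum and sum_columns are hypothetical and not taken from the question. A compiler without OpenACC support simply ignores the pragmas and runs the loop on the host.

```cpp
#include <cstddef>

// Hypothetical compile-time size; the question does not show the real ZSIZE.
constexpr std::size_t ZSIZE = 64;

// Sequential device routine. The scratch space is a fixed-size automatic
// array, so no device-side heap allocation is needed.
#pragma acc routine seq
double column_sum(const double *in, std::size_t offset)
{
    double Sn[ZSIZE];                         // private to each thread
    for (std::size_t k = 0; k < ZSIZE; ++k)
        Sn[k] = 2.0 * in[offset + k];         // placeholder work

    double total = 0.0;
    for (std::size_t k = 0; k < ZSIZE; ++k)
        total += Sn[k];
    return total;
}

// Hypothetical driver: one call of the routine per column.
void sum_columns(const double *in, double *out, std::size_t ncols)
{
    #pragma acc parallel loop copyin(in[0:ncols * ZSIZE]) copyout(out[0:ncols])
    for (std::size_t c = 0; c < ncols; ++c)
        out[c] = column_sum(in, c * ZSIZE);
}
```

If ZSIZE is only known at run time, another option is to allocate one large buffer on the host and pass each thread its own slice of it, which also avoids the device heap.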

Also, yes, if the level of parallelism is not specified, PGI will default to using "seq".

Upvotes: 1

jefflarkin

Reputation: 1279

You need to give a level of parallelism on your acc routine pragma. It looks like the correct pragma in your case would be acc routine seq since it contains no loops to parallelize.
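As an illustration of the explicit form, here is a minimal sketch of a member function marked with acc routine seq and called from a parallel loop. The names Cell, scale and fill are hypothetical, and the data clauses are one plausible choice rather than something taken from the question. Without OpenACC enabled, the pragmas are ignored and the code runs on the host.

```cpp
// Hypothetical class standing in for the one in the question.
struct Cell {
    double scale;

    // "seq" states that the routine contains no loops to parallelize:
    // each calling thread executes it sequentially.
    #pragma acc routine seq
    double method(int i, int j) const { return scale * (i + j); }
};

// Hypothetical caller: the parallelism lives in the loop, not in the routine.
void fill(const Cell &cell, double *out, int n)
{
    #pragma acc parallel loop copyin(cell) copyout(out[0:n])
    for (int i = 0; i < n; ++i)
        out[i] = cell.method(i, 1);
}
```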

Upvotes: 1
