Richard

Reputation: 61399

Debugging an "Invalid address space" error

I've built some C++ code that uses OpenACC and compiled it with the PGI compiler for use on the Tesla GPU.

Compilation succeeds without any warnings.

I run the program and get two errors:

call to cuStreamSynchronize returned error 717: Invalid address space
call to cuMemFreeHost returned error 717: Invalid address space

The internet doesn't seem to know much about this, other than to suggest enabling unified memory so that the problem is automatically swept under the rug. I'm not into that kind of solution.

How do I go about debugging this?

With C++ code running only on the CPU, I'd fire up gdb, do a backtrace, and say, "Ah ha!"

But now I have code living on the CPU and the GPU and data flowing between the two. I don't even know what tools to use.

A fallback is to start commenting out lines until the problem goes away, but that seems suboptimal too.

Upvotes: 1

Views: 923

Answers (2)

Richard

Reputation: 61399

There are some helpful environment variables that aid in debugging. Any combination can be enabled:

export PGI_ACC_TIME=1   #Profile time usage
export PGI_ACC_NOTIFY=1 #Bitmask: 1=kernel launches, 2=data transfers, 4=region entry/exit, 8=wait operations
export PGI_ACC_DEBUG=1  #Extra debugging info

Upvotes: 0

Mat Colgrove

Reputation: 5646

You can use "cuda-gdb" to debug the device code or use "cuda-memcheck" to check for memory errors.

Though I'm not sure either will help here. The error indicates that the device code issued an instruction using an address from the wrong memory space: for example, using a shared memory pointer with an instruction that expects a global memory pointer.

I have not seen this error before, nor do I see any previous bug reports for it, so I can only theorize about the cause. One possibility is a shared-memory variable (a scalar or array in a "private" clause, or a "cache" directive) that's passed from an outer gang loop into a vector routine. In that case, the vector routine may access the variable as if it were in global memory.

Whatever the cause, it's most likely a compiler error. If possible, please post a reproducing example or send one to PGI Customer Service ([email protected]) and I'll get it to our compiler engineers for investigation.

I can also try to get you a work-around once I better understand the cause. In the meantime, you can try compiling with "-ta=tesla:nollvm,keepgpu". "nollvm" makes the compiler generate an intermediate CUDA C version of the OpenACC kernels instead of using the default LLVM device code generator, and "keepgpu" keeps the intermediate ".gpu" file so you can inspect it.

Upvotes: 1
