Reputation: 131547
I'm launching a CUDA kernel I've compiled, using the cuLaunchKernel() driver API function. I'm passing my parameters in a kernelParams array, and passing nullptr for the extra argument.
Unfortunately, this fails with the error CUDA_ERROR_INVALID_HANDLE. Why? I checked the Driver API documentation to see in which cases the function might fail, and it only discusses failure with CUDA_ERROR_INVALID_VALUE (not the same thing). It doesn't discuss the error I get.
Since more than one parameter to cuLaunchKernel() is some sort of handle - what does this failure mean? (And if there are multiple possibilities - what are they?)
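For concreteness, here is a minimal sketch of the kind of call involved (the function handle, argument list and launch dimensions are placeholders, not my actual code):
void *kernel_params[] = { &arg1, &arg2 };   // pointers to the actual kernel arguments
CUresult status = cuLaunchKernel(
    kernel_function,              // CUfunction obtained via cuModuleGetFunction()
    grid_x,  grid_y,  grid_z,     // grid dimensions
    block_x, block_y, block_z,    // block dimensions
    0,                            // dynamic shared memory, in bytes
    NULL,                         // stream (NULL = default stream)
    kernel_params,                // kernelParams array
    NULL);                        // the "extra" argument
// status comes back as CUDA_ERROR_INVALID_HANDLE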
Upvotes: 3
Views: 3194
Reputation: 1
I got the same error; I downgraded Python to 3.8 and installed TensorFlow again. It works now.
Upvotes: 0
Reputation: 1
Run cuobjdump -symbols myModule.cubin to check whether your function's name has been changed (mangled); if so, add extern "C" before your device function.
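For illustration, a minimal sketch of this suggestion (the file and kernel names are placeholders): without extern "C", nvcc applies C++ name mangling, so looking the kernel up by its plain name fails and you end up launching an invalid handle.
// my_kernel.cu - compiled to a cubin and loaded with the driver API.
// Without extern "C" the symbol would be mangled (e.g. _Z9my_kernelPf),
// and cuModuleGetFunction(&fn, module, "my_kernel") would not find it.
extern "C" __global__ void my_kernel(float *data)
{
    data[threadIdx.x] *= 2.0f;
}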
Upvotes: -1
Reputation: 131547
One possibility is a failure due to a CUDA driver context switch. You may have inadvertently performed some action which pushes or replaces the current context for the CUDA device; loaded modules are part of a context, so the kernel you compiled and loaded is no longer available in the current context. This triggers a CUDA_ERROR_INVALID_HANDLE failure.
Assuming this is the case, switch the context before the launch, e.g. this way:
cuCtxPushCurrent(my_driver_context);
cuLaunchKernel(/*etc. etc. */);
/* possibly */ cuCtxPopCurrent(NULL);
or like so:
cuCtxSetCurrent(my_driver_context);
cuLaunchKernel(/*etc. etc. */);
Note that you risk leaking memory if you pop and discard the only remaining reference to a valid context; and you also risk breaking other code which assumes that the context it put in place is still the active one.
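Putting that together, here is a minimal sketch with error checking (assuming my_driver_context, my_kernel, kernel_params and the launch dimensions were set up earlier, e.g. with cuCtxCreate(), cuModuleLoad() and cuModuleGetFunction()):
CUresult status = cuCtxPushCurrent(my_driver_context); // make the context that owns the module current
if (status != CUDA_SUCCESS) { /* handle the error */ }

status = cuLaunchKernel(my_kernel,
                        grid_x, grid_y, grid_z,     // grid dimensions
                        block_x, block_y, block_z,  // block dimensions
                        0,                          // dynamic shared memory, in bytes
                        NULL,                       // stream (NULL = default stream)
                        kernel_params,              // kernelParams array
                        NULL);                      // "extra" argument
if (status != CUDA_SUCCESS) { /* handle the error */ }

CUcontext popped = NULL;
status = cuCtxPopCurrent(&popped);  // keeps a handle to the popped context (here, my_driver_context)
                                    // instead of discarding it by passing NULL
if (status != CUDA_SUCCESS) { /* handle the error */ }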
Upvotes: 3
Reputation: 5723
Well, in my case it was an out-of-memory (OOM) error which for some reason was not reported as such. When I reduced the batch size of my model, it worked. Maybe you should check whether this is the case for you as well.
Upvotes: -1