Reputation: 2822
I'd like to figure out why I'm receiving the following error for an OpenCL kernel that I'm trying to run:
Context error: [CL_OUT_OF_RESOURCES] :
OpenCL Error : clEnqueueNDRangeKernel failed: local memory usage (16416 bytes) is more than available on the device (16384 bytes)
The kernel is defined as:
__kernel void kernelFun(__read_only image2d_t src,
                        __global __write_only uchar8 *dst,
                        __global uchar4 *endpointBuffer,
                        __local uchar4 *pixelBuffer)
{
    ...
}
And I'm allocating the local memory using the standard clSetKernelArg routine:
clSetKernelArg(gKernel, 3, kPixelBufferBytes, NULL);
where kPixelBufferBytes is equal to 16384.
My question is, where are these extra 32 bytes coming from?
Upvotes: 1
Views: 177
Reputation: 9925
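Some OpenCL implementations are known to store kernel arguments in the same physical memory that backs local memory. Your kernel takes four arguments; if each one occupies 8 bytes there (an assumption about this implementation, e.g. 64-bit pointers and handles), that is exactly 32 bytes of kernel arguments, which would explain where this discrepancy is coming from.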
For example, NVIDIA GPUs definitely used to do this (see page 25 of NVIDIA's original OpenCL best practices guide).
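If that is what is happening, the practical fix is to request slightly less than the full device limit for the `__local` buffer. Below is a minimal sketch of the arithmetic, not a definitive implementation. The helper name `max_dynamic_local_bytes` and its parameters are hypothetical; the device limit would typically come from `clGetDeviceInfo` with `CL_DEVICE_LOCAL_MEM_SIZE`, and the reserved amount is assumed here to be the 32 bytes implied by the error message (on some implementations `clGetKernelWorkGroupInfo` with `CL_KERNEL_LOCAL_MEM_SIZE` may report it, but that is not guaranteed).

```c
#include <stddef.h>

/* Hypothetical helper: largest dynamic __local allocation (in bytes) that
 * still fits after the implementation has reserved `reserved_bytes` of local
 * memory for its own use (e.g. kernel arguments). The result is rounded down
 * to a whole number of elements of size `elem_size`.
 * Returns 0 if nothing fits or if elem_size is 0. */
size_t max_dynamic_local_bytes(size_t device_local_bytes,
                               size_t reserved_bytes,
                               size_t elem_size)
{
    if (elem_size == 0 || reserved_bytes >= device_local_bytes)
        return 0;

    size_t available = device_local_bytes - reserved_bytes;

    /* Keep only whole elements so the buffer size stays a multiple of elem_size. */
    return (available / elem_size) * elem_size;
}
```

With the numbers from the question (16384 bytes on the device, 32 bytes assumed reserved, 4-byte `uchar4` elements) this gives 16352 bytes, i.e. 4088 `uchar4` entries, and that value would be passed as the size in `clSetKernelArg(gKernel, 3, ..., NULL)` in place of 16384.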
Upvotes: 1