Reputation: 233
I am developing a basic ray tracer using OpenCL / OpenGL interop. I am having some issues with a kernel that uses local memory shared within a work-group.
Here is the kernel:
__kernel void ComputeDirectionalShadowTexture(
    write_only image2d_t shadowTexture,
    read_only image2d_t positionTexture,
    __constant float3* lightDirection,
    __constant float4* spheres)
{
    __local bool* shadowReduce[2];
    __local size_t idX, idY, idZ, localID;
    idX = get_global_id(0);
    idY = get_global_id(1);
    idZ = get_global_id(2);
    localID = get_local_id(2);
    //...Read Textures
    //...Perform Computation
    //...Write results
    if(shadowReduce[localID])
        write_imagef(shadowTexture, threadCoord.xy, (float4)(1.0f, 0.0f, 0.0f, 1.0f));
}
When running this, it is as if the get_local_id() function is never returning 0 (or only returning 1).
I would expect the problem to be related to how I am invoking the kernel:
size_t numGlobal[3] =
{
    rBuffer->textureWidth,
    rBuffer->textureHeight,
    numSpheres
};
size_t numLocal[3] = { 1, 1, numSpheres };
cl_event execution;
//Execute kernel
clError = clEnqueueNDRangeKernel
(
    buffer->clQueue,
    members->directionalShadowKernel,
    3,                        //work_dim
    NULL,                     //no global work offset
    numGlobal,                //global work size (array decays to const size_t*)
    numLocal,                 //local work size
    numBeforeExecution,
    completeBeforeExecution,
    &execution
);
Where numSpheres is a constant set to 2.
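For reference, here is what I expect the IDs to be with these sizes. This is only a host-side sketch in plain C, not OpenCL code. It assumes a zero global offset (NULL is passed above), and the function names are hypothetical. It relies on the relation global ID = group ID * local size + local ID along each dimension.

```c
#include <stddef.h>

/* Expected get_local_id(dim) for a given global ID,
   assuming a zero global work offset. */
size_t expected_local_id(size_t global_id, size_t local_size)
{
    return global_id % local_size;
}

/* Expected get_group_id(dim) under the same assumption. */
size_t expected_group_id(size_t global_id, size_t local_size)
{
    return global_id / local_size;
}

/* Number of work-groups along one dimension. Assumes the global size
   is a multiple of the local size, as OpenCL 1.x requires. */
size_t num_groups(size_t global_size, size_t local_size)
{
    return global_size / local_size;
}
```

With numGlobal[2] = numLocal[2] = 2, this gives a single work-group along dimension 2 whose two work-items should see local IDs 0 and 1.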
Any/all feedback is appreciated.
Upvotes: 2
Views: 1130
Reputation: 233
I made a rookie mistake in the above code. If anybody ever has this problem, please make sure you are not assigning the result of get_local_id() to a variable declared with the __local address space qualifier, as I do here:
localID = get_local_id(2);
Of course the local variable gets overwritten by each thread in the work group, because the local address space is shared across a work group.
So instead of declaring localID as:
__local size_t localID;
it should be declared as:
size_t localID;
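The overwrite can be modeled on the host in plain C. This is only a sketch, not OpenCL code: the work-group is simulated as a loop over work-items, and WORK_GROUP_SIZE, read_shared_id and read_private_id are hypothetical names. It also assumes that every work-item writes before any of them reads the value back. On a real device the order is unspecified, since this is a data race.

```c
#include <stddef.h>

#define WORK_GROUP_SIZE 2 /* matches numSpheres = 2 in the question */

/* Models `__local size_t localID;`: one variable shared by the whole
   work-group. Every work-item stores its own ID into the same slot, so
   after all stores only the last one remains. Fills seen[i] with the
   value that work-item i reads back. */
void read_shared_id(size_t seen[WORK_GROUP_SIZE])
{
    size_t shared_local_id = 0; /* single copy for the whole group */

    for (size_t item = 0; item < WORK_GROUP_SIZE; ++item)
        shared_local_id = item;       /* localID = get_local_id(2); */

    for (size_t item = 0; item < WORK_GROUP_SIZE; ++item)
        seen[item] = shared_local_id; /* all items read the same slot */
}

/* Models `size_t localID;`: each work-item has its own private copy. */
void read_private_id(size_t seen[WORK_GROUP_SIZE])
{
    for (size_t item = 0; item < WORK_GROUP_SIZE; ++item) {
        size_t private_local_id = item; /* private to this work-item */
        seen[item] = private_local_id;
    }
}
```

Under these assumptions, read_shared_id gives 1 for both work-items, which matches get_local_id() appearing never to return 0, while read_private_id gives 0 and 1.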
Hope this helps somebody.
Upvotes: 5