Reputation: 1307
I want to write values into an output array from an OpenCL kernel, but only when a condition is met. The output index should advance by one after each write, and because it depends on the condition, it is not known in advance. I pass the output index as a kernel argument:
__kernel void match(__global float* input, __global float* output, int pos)
{
if(condition)
{
output[pos] = input[get_global_id(0)];
atomic_inc(&pos); // this is where the problem lies
}
}
I also tried passing pos as an array:
__kernel void match(__global float* input, __global float* output, __global int* pos)
{
if(condition)
{
output[pos[0]] = input[get_global_id(0)];
atomic_inc(pos[0]); // this is where the problem lies
}
}
clBuildProgram returned error code -11 (CL_BUILD_PROGRAM_FAILURE) in both cases. The kernel built when I incremented the value with a plain pos++ instead, but that didn't give me back the final position in the array.
Can anyone explain what I am doing wrong?
Upvotes: 0
Views: 2073
Reputation: 6333
You can't read the variable separately from the atomic_inc that increments it, or you will have a race condition: another thread can change it between your read and your increment. The documentation for atomic_inc says that it returns the old value from before the increment, so if each thread uses that return value as its index, each one gets a unique value. So the correct way to use it is:
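int localPos = atomic_inc(&pos[0]); // pos is the __global int* argument; atomic_inc needs a pointer and returns the pre-increment value
output[localPos] = input[get_global_id(0)];
"pos" has to live in global or local memory, because atomic_inc only accepts a pointer into one of those address spaces. That is why the first kernel (a private int argument) and the second (passing the value pos[0] instead of its address) both fail to build. For your usage it should be global.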
Upvotes: 0
Reputation: 1020
Not sure if I understand the question, but let's give it a shot:
Is each element in input assigned a thread? If so, input would be indexed using input[get_global_id(0)] in the kernel, assuming (huge assumption) you are using a 1D array and called clEnqueueNDRangeKernel() with a global work size similar to size_t Global_Work_Size[1] = {input_size}.
In a kernel like the first example, int pos is passed by value, so EVERY thread gets its own private copy holding the same constant. An increment in one thread is never seen by the others, so it won't work as I'm interpreting your question.
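To make the difference between a per-thread copy and a shared counter concrete, here is a minimal sketch in plain C11 rather than OpenCL C, so it can be compiled and run on the host. It assumes atomic_fetch_add from <stdatomic.h> as a stand-in for atomic_inc (both return the value held before the increment), a sequential loop in place of the threads, and a hypothetical condition (input greater than a threshold). The name match_emulated and its parameters are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Plain C11 stand-in for the kernel: each loop iteration plays the role
 * of one work-item. output must have room for n elements. */
size_t match_emulated(const float *input, size_t n, float threshold,
                      float *output, atomic_int *pos)
{
    for (size_t gid = 0; gid < n; ++gid) {
        if (input[gid] > threshold) {          /* hypothetical condition */
            /* Like atomic_inc: returns the value held BEFORE the increment,
             * so this slot is reserved for this work-item alone. */
            int slot = atomic_fetch_add(pos, 1);
            assert((size_t)slot < n);          /* never write past output */
            output[slot] = input[gid];
        }
    }
    /* The shared counter now holds the number of values written. */
    return (size_t)atomic_load(pos);
}
```

After the loop the counter holds the number of values written, which is the final position the question asks about; with a real kernel the host would read it back from the pos buffer, which must be set to 0 before the launch. With threads that really run concurrently, the order of the values in output is not deterministic.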
If the kernel's indexes don't map in an easy way, the index needs to be computed on the fly, or another array needs to be passed in as a Look-Up Table (LUT) of indexes that maps input to output.
Finally, you can use clGetProgramBuildInfo to find out exactly what the error is. See the write-up I did in another thread.
Upvotes: 2