Reputation: 1
I am trying to build an index structure in kernel code:
atomicCAS((int*)&index[val], -1, atomicAdd((unsigned int*)&index_pos, 1));
index[] is declared as a dynamic shared memory array and initialized to -1; index_pos is declared as volatile.
The intuition is the following: only the first thread in the block to reach a given index[val] should initialize it and increment index_pos. However, I have noticed that index_pos is incremented multiple times by conflicting threads. Why is this happening?
Upvotes: 0
Views: 502
Reputation: 1507
I don't fully understand what your code is supposed to do, but I see no reason why the variable index_pos
should not be incremented multiple times. Nesting one atomic operation inside another does not produce a composite atomic operation: the inner atomicAdd is an argument, so every thread evaluates it before its atomicCAS is even attempted.
Example:
atomicAdd(atomicAdd(x, 1), 1);
does not act as
atomicAdd(x, 2);
but:
atomicAdd(x, 1);
atomicAdd(x, 1);
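The same point can be made with the call from the question. Below is a minimal sketch, assuming a single block in which every thread uses the same val; the kernel name nested_atomics and the host helper run_nested_demo are made up for illustration and are not from the original post. The nested call is written out as the two statements it amounts to.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: every thread targets the same slot (val == 0).
__global__ void nested_atomics(unsigned int *counter_out)
{
    __shared__ int index[1];
    __shared__ unsigned int index_pos;

    if (threadIdx.x == 0) {
        index[0] = -1;   // slot starts out unclaimed
        index_pos = 0;
    }
    __syncthreads();

    const int val = 0;
    // atomicCAS(&index[val], -1, atomicAdd(&index_pos, 1)) behaves like:
    unsigned int ticket = atomicAdd(&index_pos, 1u);  // executed by EVERY thread
    atomicCAS(&index[val], -1, (int)ticket);          // succeeds for only one thread

    __syncthreads();
    if (threadIdx.x == 0)
        *counter_out = index_pos;  // number of increments that actually happened
}

// Hypothetical host helper: launches one block and returns the final index_pos.
unsigned int run_nested_demo(int threads)
{
    unsigned int *d_counter = nullptr;
    cudaMalloc(&d_counter, sizeof(unsigned int));
    nested_atomics<<<1, threads>>>(d_counter);
    unsigned int h_counter = 0;
    cudaMemcpy(&h_counter, d_counter, sizeof(h_counter), cudaMemcpyDeviceToHost);
    cudaFree(d_counter);
    return h_counter;
}
```

Calling run_nested_demo(64) from main should return the thread count rather than 1, because the increment is not conditional on the compare-and-swap succeeding.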
EDIT AFTER COMMENT:
With the information from your comment, I would get the described behaviour with the following code:
if (index[val] == -1) {  // this check is just an optimization
    atomicCAS((int*)&index[val], -1, threadId);  // only one thread's CAS succeeds
}
__threadfence_block();
if (index[val] == threadId) {  // true only for the thread whose CAS succeeded
    index_pos++;  // index_pos is incremented only "once" per val
}
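A possible variant, offered as a sketch rather than a definitive fix: atomicCAS returns the value it found, so the thread that gets -1 back knows it claimed the slot without re-reading index[val]. The sketch assumes, as in the question's original line, that index[val] should end up holding a position taken from index_pos, and it uses atomicAdd for the increment because winners for different val could otherwise race on a plain ++. The CLAIMED sentinel, the kernel name build_index, and the host helper run_build_index are hypothetical.

```cuda
#include <cuda_runtime.h>

#define CLAIMED (-2)  // hypothetical sentinel: slot claimed, position not yet written

// Hypothetical kernel: vals[i] is the key of thread i, with 0 <= vals[i] < num_keys.
__global__ void build_index(const int *vals, int num_keys, unsigned int *count_out)
{
    extern __shared__ int index[];      // num_keys entries of dynamic shared memory
    __shared__ unsigned int index_pos;

    for (int i = threadIdx.x; i < num_keys; i += blockDim.x)
        index[i] = -1;                  // mark every slot as unclaimed
    if (threadIdx.x == 0)
        index_pos = 0;
    __syncthreads();

    const int val = vals[threadIdx.x];
    // atomicCAS returns the old value, so exactly one thread per val sees -1.
    if (atomicCAS(&index[val], -1, CLAIMED) == -1) {
        // Only the winner takes a position; atomicAdd keeps concurrent winners safe.
        index[val] = (int)atomicAdd(&index_pos, 1u);
    }
    __syncthreads();                    // index[] is complete only after this barrier

    if (threadIdx.x == 0)
        *count_out = index_pos;         // number of distinct keys seen in the block
}

// Hypothetical host helper: runs one block and returns the final index_pos.
unsigned int run_build_index(const int *h_vals, int threads, int num_keys)
{
    int *d_vals = nullptr;
    unsigned int *d_count = nullptr;
    cudaMalloc(&d_vals, threads * sizeof(int));
    cudaMalloc(&d_count, sizeof(unsigned int));
    cudaMemcpy(d_vals, h_vals, threads * sizeof(int), cudaMemcpyHostToDevice);
    build_index<<<1, threads, num_keys * sizeof(int)>>>(d_vals, num_keys, d_count);
    unsigned int h_count = 0;
    cudaMemcpy(&h_count, d_count, sizeof(h_count), cudaMemcpyDeviceToHost);
    cudaFree(d_vals);
    cudaFree(d_count);
    return h_count;
}
```

Between the CAS and the store, other threads may observe CLAIMED in index[val], so nothing should read index[] before the final __syncthreads().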
Upvotes: 1