Ty1er

Reputation: 1

Building an index using atomic operations in CUDA

I am trying to build an index structure in kernel code:

atomicCAS((int*)&index[val], -1, atomicAdd((unsigned int*)&index_pos, 1));

index[] is declared as a dynamic shared memory array and initialized to -1; index_pos is declared as volatile.

The intuition is the following: for a given val, only the first thread in the block to reach this line should initialize index[val] and increment index_pos. However, I have noticed that index_pos is incremented multiple times by conflicting threads. Why is this happening?

Upvotes: 0

Views: 502

Answers (1)

stuhlo

Reputation: 1507

I was unable to understand what your code is supposed to do; however, I don't see any reason why the variable index_pos should not be incremented more than once. Nesting one atomic operation inside another does not produce a composite atomic operation.

Example:

atomicAdd(atomicAdd(x, 1), 1);

does not act as

atomicAdd(x, 2);

but:

atomicAdd(x, 1);
atomicAdd(x, 1);
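Applied to the line in the question, the same reasoning suggests that the inner atomicAdd runs in every thread, because the argument is evaluated before atomicCAS is even called. The following is a minimal sketch of that expansion, not the asker's actual kernel: it assumes a single slot in global memory instead of the shared index[] array, and the names nested_atomics and run_nested are hypothetical. Error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Expanded form of the one-liner from the question, reduced to a single slot.
__global__ void nested_atomics(int *slot, unsigned int *counter)
{
    // The argument is evaluated first, so EVERY thread performs this increment.
    unsigned int pos = atomicAdd(counter, 1u);
    // Only one thread's CAS finds -1 and stores its pos. The other threads
    // change nothing here, but their increments have already happened.
    atomicCAS(slot, -1, (int)pos);
}

// Hypothetical host helper: runs one block and returns the final counter value.
unsigned int run_nested(int threads)
{
    int *slot = nullptr;
    unsigned int *counter = nullptr;
    cudaMalloc(&slot, sizeof(int));
    cudaMalloc(&counter, sizeof(unsigned int));

    const int empty = -1;
    const unsigned int zero = 0;
    cudaMemcpy(slot, &empty, sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(counter, &zero, sizeof(unsigned int), cudaMemcpyHostToDevice);

    nested_atomics<<<1, threads>>>(slot, counter);

    unsigned int result = 0;
    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&result, counter, sizeof(result), cudaMemcpyDeviceToHost);
    cudaFree(slot);
    cudaFree(counter);
    return result;
}

int main()
{
    printf("counter after 64 threads: %u\n", run_nested(64));
    return 0;
}
```

With 64 threads contending for one slot, the counter should end up at 64 rather than 1, which matches the behavior described in the question.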

EDIT AFTER COMMENT:

Given the information in your comment, I would get the described behavior with the following code:

if(index[val] == -1) { // just an optimization: skip the CAS if the slot is already taken
    atomicCAS((int*)&index[val], -1, threadId); // only one thread's CAS succeeds per val
}

__threadfence_block();

if(index[val] == threadId) {
    atomicAdd((unsigned int*)&index_pos, 1); // once per val; atomic because winners for different val may run concurrently
}
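A variant of the same idea uses the return value of atomicCAS to decide the winner directly: atomicCAS returns the old value, so the thread that sees -1 come back is the one that claimed the slot, and no second read of index[val] is needed. The sketch below is only an illustration under several assumptions: index and index_pos live in global memory rather than shared memory, the global thread id stands in for threadId, and build_index and count_claimed are hypothetical names. Note that index[val] ends up holding the winning thread id, not a position; storing the position as well would need an extra step. Error checking is omitted.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread tries to claim index[val]; only the winner bumps the counter.
__global__ void build_index(const int *vals, int n, int *index,
                            unsigned int *index_pos)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    int val = vals[tid];
    // atomicCAS returns the previous value: -1 means this thread won the slot.
    if (atomicCAS(&index[val], -1, tid) == -1) {
        // Winners for different val can arrive together, so stay atomic here.
        atomicAdd(index_pos, 1u);
    }
}

// Hypothetical host helper: returns how many slots were claimed.
// Every value in h_vals is assumed to lie in [0, num_keys).
unsigned int count_claimed(const int *h_vals, int n, int num_keys)
{
    int *d_vals = nullptr;
    int *d_index = nullptr;
    unsigned int *d_pos = nullptr;
    cudaMalloc(&d_vals, n * sizeof(int));
    cudaMalloc(&d_index, num_keys * sizeof(int));
    cudaMalloc(&d_pos, sizeof(unsigned int));

    cudaMemcpy(d_vals, h_vals, n * sizeof(int), cudaMemcpyHostToDevice);
    // Setting every byte to 0xFF makes each int equal to -1.
    cudaMemset(d_index, 0xFF, num_keys * sizeof(int));
    cudaMemset(d_pos, 0, sizeof(unsigned int));

    const int block = 128;
    build_index<<<(n + block - 1) / block, block>>>(d_vals, n, d_index, d_pos);

    unsigned int result = 0;
    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&result, d_pos, sizeof(result), cudaMemcpyDeviceToHost);
    cudaFree(d_vals);
    cudaFree(d_index);
    cudaFree(d_pos);
    return result;
}

int main()
{
    const int vals[] = {3, 1, 3, 0, 1, 3, 2, 0};
    printf("slots claimed: %u\n", count_claimed(vals, 8, 4));
    return 0;
}
```

For the eight sample values there are four distinct keys, so the counter should come out as 4 no matter which thread wins each slot.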

Upvotes: 1
