starter

Reputation: 325

CUDA: writing to the same global memory location from several threads

I have a kernel in which several threads write to the same array location, say array[i], in global memory. Related questions here on SO recommend atomics and other techniques, but none of the answers shows actual CUDA code. Can anyone show CUDA code in which array[i], i.e. the element of array at index i, is written atomically by several threads? Thanks!

Upvotes: 1

Views: 1021

Answers (1)

Greg Smith

Reputation: 11549

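CUDA provides compiler intrinsics for atomic operations. See the CUDA C Programming Guide for details on which atomic operations are available for each compute capability. In the example below, counters is a pointer to an array of integers of size gridDim.x. Each thread increments the array element indexed by its blockIdx.x, so all threads in a block update the same location. If the array is zeroed before the launch, counters[b] ends up holding the number of threads in block b.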

__global__ void CountThreadsInBlock(int* counters)
{
    int i = blockIdx.x;
    atomicAdd(&counters[i], 1);
}

// NOTE: Assumes a 1D launch; only blockIdx.x is used to index counters.
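
For completeness, here is a minimal sketch of how the kernel above might be driven from the host. The helper name countThreadsPerBlock and its parameters are hypothetical and not part of the original answer. It assumes a 1D grid, zeroes the counters before the launch, and returns an empty vector if any CUDA call fails.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Kernel from the answer: every thread of a block increments that block's counter.
__global__ void CountThreadsInBlock(int* counters)
{
    int i = blockIdx.x;
    atomicAdd(&counters[i], 1);  // many threads hit counters[i]; the add is atomic
}

// Hypothetical host helper (illustration only, not from the original answer).
// Launches numBlocks blocks of threadsPerBlock threads and returns the counters.
std::vector<int> countThreadsPerBlock(int numBlocks, int threadsPerBlock)
{
    assert(numBlocks > 0 && threadsPerBlock > 0);

    const size_t bytes = static_cast<size_t>(numBlocks) * sizeof(int);
    int* dev = nullptr;
    if (cudaMalloc((void**)&dev, bytes) != cudaSuccess) return {};

    // The counters must start at zero, otherwise the sums are meaningless.
    if (cudaMemset(dev, 0, bytes) != cudaSuccess) { cudaFree(dev); return {}; }

    CountThreadsInBlock<<<numBlocks, threadsPerBlock>>>(dev);
    if (cudaGetLastError() != cudaSuccess) { cudaFree(dev); return {}; }

    // cudaMemcpy on the default stream waits for the kernel to finish.
    std::vector<int> host(numBlocks, 0);
    cudaError_t err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return {};
    return host;
}
```

Calling countThreadsPerBlock(4, 128) should yield four counters, each equal to 128, because every one of the 128 threads in a block adds 1 to the same element. Without atomicAdd, a plain counters[i] += 1 would be a data race and the result would be undefined.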

Upvotes: 2
