Farzad

Reputation: 3438

Why is using the "volatile" keyword for shared memory not possible when atomic operations are done on that shared memory?

I have a piece of CUDA code in which threads perform atomic operations on shared memory. Since the result of an atomic operation is visible to the other threads of the block immediately anyway, I thought it might be good to instruct the compiler to treat the shared memory as volatile.
So I changed

__global__ void CoalescedAtomicOnSharedMem(int* data, uint nElem)
{
    __shared__ int smem_data[BLOCK_SIZE];
    uint tid = (blockIdx.x * blockDim.x) + threadIdx.x;
    for ( uint i = tid; i < nElem; i += blockDim.x*gridDim.x){
        atomicAdd( smem_data+threadIdx.x, 6);
    }
}

to

__global__ void volShared_CoalescedAtomicOnSharedMem(int* data, uint nElem)
{
    volatile __shared__ int smem_data[BLOCK_SIZE];
    uint tid = (blockIdx.x * blockDim.x) + threadIdx.x;
    for ( uint i = tid; i < nElem; i += blockDim.x*gridDim.x){
        atomicAdd( smem_data+threadIdx.x, 6);
    }
}

With the above change, the following compile-time error occurs:

error: no instance of overloaded function "atomicAdd" matches the argument list
        argument types are: (volatile int *, int)

Why isn't a volatile address supported as an argument for atomic operations? Is it because the compiler already treats the shared memory as volatile as soon as it identifies that atomic operations will be performed on it?

Upvotes: 9

Views: 4471

Answers (2)

user3344003

Reputation: 21617

The previous poster has correctly identified the problem: There is no atomicAdd function defined that takes a volatile parameter.

As to why this is the case, my guess is that the library developers simply omitted that interface. Imagine all the combinations of volatile, const, and possible parameter types, and the number of potential interfaces starts to explode.

Why isn't a volatile address supported as an argument for atomic operations?

Atomic operations such as atomicAdd are not part of the C/C++ language itself. In your case, they are provided as library functions, probably implemented in assembly language.

Is it because the compiler already treats the shared memory as volatile as soon as it identifies that atomic operations will be performed on it?

No, this is simply the way the library writer has defined the function interface.
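
That said, if a volatile-qualified interface is wanted, a thin user-side wrapper can supply it. A minimal sketch (the name volatileAtomicAdd is hypothetical, not part of CUDA):

    // Hypothetical user-supplied overload: accepts a volatile pointer and
    // forwards to the built-in atomicAdd after casting the qualifier away.
    __device__ int volatileAtomicAdd(volatile int *address, int val)
    {
        // const_cast may remove volatile as well as const; the pointed-to
        // shared/global memory location is unchanged.
        return atomicAdd(const_cast<int *>(address), val);
    }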

Upvotes: -1

Robert Crovella

Reputation: 151879

The definition of the volatile qualifier is given in the programming guide. It instructs the compiler to always generate an actual read or write for each access, and never to "optimize" the access away, for example by caching the value in a register.

Since atomic operations are guaranteed to act on actual memory locations (either shared or global), the combination of the two is unnecessary. Therefore, versions of the atomic functions prototyped for the volatile qualifier are not provided.

If you have a memory location that is already declared as volatile, simply cast the address to the corresponding non-volatile pointer type when you pass it to your atomic function. The behavior will be as expected.
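
For example, applying that cast to the kernel from the question (a sketch; a C-style cast is shown, though const_cast works equally well):

    __global__ void volShared_CoalescedAtomicOnSharedMem(int* data, uint nElem)
    {
        volatile __shared__ int smem_data[BLOCK_SIZE];
        uint tid = (blockIdx.x * blockDim.x) + threadIdx.x;
        for ( uint i = tid; i < nElem; i += blockDim.x*gridDim.x){
            // Cast away volatile so the call matches atomicAdd(int *, int);
            // the atomic still operates on the same shared memory location.
            atomicAdd( (int *)(smem_data + threadIdx.x), 6);
        }
    }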

Therefore, atomic operations can operate on locations specified as volatile with this proviso.

The simple fact that you have accessed a particular location using atomics somewhere in your code does not mean that the compiler will treat every access elsewhere as implicitly volatile. If you need volatile behavior elsewhere, declare it explicitly.
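
As a sketch of that last point (the kernel name and layout here are hypothetical), a volatile-qualified pointer can be introduced for just the accesses that need it:

    __global__ void sketch_kernel(int *out)
    {
        __shared__ int smem_data[BLOCK_SIZE];   // not declared volatile
        smem_data[threadIdx.x] = 0;
        __syncthreads();

        atomicAdd(smem_data + threadIdx.x, 6);  // atomic access compiles as-is

        // The atomic above does not make this later plain access implicitly
        // volatile. A volatile-qualified pointer forces an actual shared
        // memory read for this access only.
        volatile int *vsmem = smem_data;
        out[threadIdx.x] = vsmem[threadIdx.x];
    }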

Upvotes: 8
