user2005893

Reputation: 117

Overuse of __syncthreads() in the code

I understand the purpose of __syncthreads(), but I sometimes find it overused in code.

For instance, in the code below, taken from NVIDIA notes, each thread mainly calculates s_data[tx] - s_data[tx-1]. Each thread needs the element it reads from global memory and the element read by its neighboring thread. Both threads will be in the same warp, so they are scheduled for execution together and should both have finished loading their data from global memory before either one uses it.

I believe the code will still work without __syncthreads(), but obviously the NVIDIA notes say otherwise. Any comments, please?

// Example – shared variables
// optimized version of adjacent difference
__global__ void adj_diff(int *result, int *input)
{
    // shorthand for threadIdx.x
    int tx = threadIdx.x;
    // allocate a __shared__ array, one element per thread
    __shared__ int s_data[BLOCK_SIZE];
    // each thread reads one element to s_data
    unsigned int i = blockDim.x * blockIdx.x + tx;
    s_data[tx] = input[i];
    // avoid race condition: ensure all loads
    // complete before continuing
    __syncthreads();

    if(tx > 0)
        result[i] = s_data[tx] - s_data[tx-1];
    else if(i > 0)
    {
        // handle thread block boundary
        result[i] = s_data[tx] - input[i-1];
    }
}

Upvotes: 0

Views: 158

Answers (1)

Robert Crovella

Reputation: 152174

It would be nice if you included a link to where, in the "Nvidia notes", this appeared.

both threads will be in the same warp

No, they won't, at least not in all cases. What happens when tx = 32? Then the thread corresponding to tx belongs to warp 1 in the block, and the thread corresponding to tx-1 belongs to warp 0 in the block.

There's no guarantee that warp 0 has executed before warp 1, so the code could fail without the call to __syncthreads(). Without it, the value of s_data[tx-1] could be invalid, because warp 0 may not have run yet and therefore may not have loaded it.
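
To make the warp-boundary case concrete, here is a minimal sketch of the kernel wrapped in a small test program. Several things in it are assumptions that are not in the original notes: BLOCK_SIZE is set to 64 so that each block holds two warps, the kernel takes an extra length parameter n with a bounds check, and the helpers warp_of and run_adj_diff_check are hypothetical names introduced only for illustration. The warp width is hard-coded as 32, which matches current NVIDIA GPUs.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Assumed value: two warps per block, so tx = 32 sits on a warp boundary.
#define BLOCK_SIZE 64

// Hypothetical helper: warp index of a thread within its block,
// assuming a warp width of 32 threads.
__host__ __device__ inline int warp_of(int tx) { return tx / 32; }

__global__ void adj_diff(int *result, const int *input, unsigned int n)
{
    int tx = threadIdx.x;
    __shared__ int s_data[BLOCK_SIZE];
    unsigned int i = blockDim.x * blockIdx.x + tx;

    // Guarded load (an addition to the original kernel) so that a
    // partial last block does not read out of bounds.
    if (i < n)
        s_data[tx] = input[i];

    // Block-wide barrier. Thread tx = 32 (warp 1) reads s_data[31],
    // which is written by a thread in warp 0, so a barrier across
    // warps is required before the reads below.
    __syncthreads();

    if (i >= n)
        return;

    if (tx > 0)
        result[i] = s_data[tx] - s_data[tx - 1];
    else if (i > 0)
        result[i] = s_data[tx] - input[i - 1];  // thread block boundary
}

// Hypothetical helper: runs the kernel and compares against a CPU reference.
bool run_adj_diff_check()
{
    const unsigned int n = 4 * BLOCK_SIZE;
    std::vector<int> h_in(n), h_out(n, 0);
    for (unsigned int k = 0; k < n; ++k)
        h_in[k] = static_cast<int>(k * k);

    int *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(int)) != cudaSuccess)
        return false;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }
    cudaMemcpy(d_in, h_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, n * sizeof(int));

    adj_diff<<<n / BLOCK_SIZE, BLOCK_SIZE>>>(d_out, d_in, n);

    // This copy waits for the kernel on the default stream to finish.
    bool ok = cudaMemcpy(h_out.data(), d_out, n * sizeof(int),
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(d_in);
    cudaFree(d_out);

    // Element 0 is never written by the kernel, so start at 1.
    for (unsigned int k = 1; ok && k < n; ++k)
        ok = (h_out[k] == h_in[k] - h_in[k - 1]);
    return ok;
}

int main()
{
    printf("tx=31 -> warp %d, tx=32 -> warp %d\n", warp_of(31), warp_of(32));
    printf("adjacent difference %s\n",
           run_adj_diff_check() ? "matches CPU reference" : "FAILED");
    return 0;
}
```

Removing the barrier from this sketch may still produce correct output on a given run, since the failure depends on how the warps happen to be scheduled; that is what makes the race easy to overlook. Note also that on GPUs with independent thread scheduling (Volta and later), even threads within the same warp are not guaranteed to run in lockstep, so the same-warp assumption is not a safe replacement for explicit synchronization either.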

Upvotes: 5
