Vickey

Reputation: 91

How to make different threads execute different parts in CUDA?

I am working with CUDA and have a problem related to thread synchronization. I need different threads to execute different parts of my code, like this:

one thread  ->
all threads ->
one thread  ->

That is, the first part of the code should be executed by only one thread, the next part by all threads, and the last part by a single thread again. The threads also execute this sequence inside a loop. Can anyone tell me how to do that?

Upvotes: 1

Views: 629

Answers (3)

username_4567

Reputation: 4903

If your kernel launches multiple blocks, you need a custom synchronization mechanism across blocks, because __syncthreads() only synchronizes the threads within one block. If your kernel launches only one block, then __syncthreads() will work.
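For the single-block case, the one/all/one sequence inside a loop could be sketched as follows. This is a minimal illustration under stated assumptions, not code from the question: the kernel name phased_loop, the helper run_phased_loop, the shared variable scale, and the arithmetic in each phase are all hypothetical placeholders, and the kernel assumes it is launched with exactly one block.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel; assumes a launch with exactly ONE block.
__global__ void phased_loop(float *data, int n, int iterations)
{
    __shared__ float scale;  // written by thread 0, read by all threads

    // Every thread runs the same number of iterations, so each
    // __syncthreads() below is reached by all threads of the block.
    for (int it = 0; it < iterations; ++it) {
        if (threadIdx.x == 0) {
            scale = 2.0f;            // single-thread phase (placeholder work)
        }
        __syncthreads();             // make 'scale' visible to the whole block

        // all-thread phase: block-stride loop over the array
        for (int i = threadIdx.x; i < n; i += blockDim.x) {
            data[i] *= scale;
        }
        __syncthreads();             // wait until every thread has finished

        if (threadIdx.x == 0) {
            data[0] += 1.0f;         // single-thread phase (placeholder work)
        }
        __syncthreads();             // finish before the next iteration starts
    }
}

// Hypothetical host helper: copies h to the device, runs the kernel, copies back.
void run_phased_loop(float *h, int n, int iterations)
{
    float *d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);
    phased_loop<<<1, 4>>>(d, n, iterations);   // one block of 4 threads
    cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d);
}

int main()
{
    const int n = 8;
    float h[n];
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    run_phased_loop(h, n, 3);
    printf("data[0] = %g, data[1] = %g\n", h[0], h[1]);
    return 0;
}
```

The barriers sit outside the if statements on purpose: __syncthreads() has to be reached by every thread of the block, so it must not be placed inside a branch that only thread 0 takes.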

Upvotes: 0

mch

Reputation: 7373

You can only synchronize threads within a single block. It is possible to synchronize between multiple blocks, but only under very specific circumstances. If you need global synchronization between all threads, the way to do that is to launch a new kernel.
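As a rough sketch of the "launch a new kernel" approach, each phase can be its own kernel, with the loop moved to the host. Kernels issued to the same stream run in order, so every launch boundary acts as a global barrier. The names serial_step, parallel_step, and run_phases, and the work done in each phase, are hypothetical and only serve the illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Single-thread phase: intended to be launched as <<<1, 1>>>.
__global__ void serial_step(float *data)
{
    data[0] += 1.0f;                 // placeholder work
}

// All-thread phase: may span many blocks.
__global__ void parallel_step(float *data, int n)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx < n) {                   // prevent buffer overruns
        data[idx] *= 2.0f;           // placeholder work
    }
}

// Hypothetical host helper: the loop lives on the host, and each kernel
// starts only after the previous one in the default stream has finished.
void run_phases(float *d_data, int n, int iterations)
{
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    for (int it = 0; it < iterations; ++it) {
        serial_step<<<1, 1>>>(d_data);
        parallel_step<<<blocks, threads>>>(d_data, n);
        serial_step<<<1, 1>>>(d_data);
    }
    cudaDeviceSynchronize();         // wait for all queued kernels
}

int main()
{
    const int n = 1000;
    float h[n];
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    float *d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);

    run_phases(d, n, 2);

    cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d);
    printf("data[0] = %g, data[1] = %g\n", h[0], h[1]);
    return 0;
}
```

The trade-off is launch overhead on every phase, which matters when the loop is long and each phase is short. In exchange, the all-thread phase can use as many blocks as the problem needs.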

Within a block, you can synchronize threads using __syncthreads(). For example:

__global__ void F(float *A, int N)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x;

    if (threadIdx.x == 0) // thread 0 of each block does this:
    {
        // Whatever
    }
    __syncthreads();

    if (idx < N) // prevent buffer overruns
    {
        A[idx] = A[idx] * A[idx];  // "real work"
    }

    __syncthreads();

    if (threadIdx.x == 0) // thread 0 of each block does this:
    {
        // Whatever
    }
}

Upvotes: 2

Paul R

Reputation: 212929

You need to use the thread ID to control what is executed, e.g.

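int thread_ID = threadIdx.x;  // index of this thread within its block

if (thread_ID == 0)
{
  // do single thread stuff
}
__syncthreads();  // the other threads wait here for thread 0

// do common stuff on all threads

__syncthreads();  // thread 0 waits here until all threads are done

if (thread_ID == 0)
{
  // do single thread stuff
}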

Upvotes: 0
