algoProg

Reputation: 728

CUDA: Selecting one thread per warp

I want to select only one thread per warp for a few operations.

For example, let's start with a 1-D block dim of (64, 1, 1). As I understand it, this results in two warps, given that the warp size is 32. In this case, I can use the following code to select one thread per warp:

if(threadIdx.x % 32 == 0) { ... }

First, is this valid? I am not sure whether the way threads are mapped to warps on the hardware is well defined.

Second, how can this be achieved for a 2-D block dim of (32, 32, 1)? A simple % 32 on threadIdx.x won't work here, since the thread indexing is different in two dimensions.

Thanks.

Upvotes: 2

Views: 596

Answers (1)

RayaneCTX

Reputation: 593

Threads within a block are mapped to warps in a consistent way.

In section 4.1 of the CUDA programming guide:

The way a block is partitioned into warps is always the same; each warp contains threads of consecutive, increasing thread IDs with the first warp containing thread 0.

In section 2.2.1 of the CUDA programming guide:

The index of a thread and its thread ID relate to each other in a straightforward way: For a one-dimensional block, they are the same; for a two-dimensional block of size (Dx, Dy), the thread ID of a thread of index (x, y) is (x + y Dx); for a three-dimensional block of size (Dx, Dy, Dz), the thread ID of a thread of index (x, y, z) is (x + y Dx + z Dx Dy).

To select a single thread per warp in your kernel, compute the thread ID from the formula above and test it against the warp size:

int id = threadIdx.x + threadIdx.y * blockDim.x + threadIdx.z * blockDim.x * blockDim.y;
if (id % 32 == 0) {

    /* First thread of each warp is selected. */

}
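To see what this selects for the two block shapes in the question, here is a minimal host-side sketch in plain C++ that replays the thread ID formula quoted above without needing a GPU. `Dim3`, `kWarpSize`, `linear_thread_id` and `warp_leaders` are hypothetical names made up for this illustration, not CUDA APIs, and the warp size of 32 is an assumption (device code can read the built-in `warpSize` instead).

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side stand-in for CUDA's threadIdx / blockDim values.
struct Dim3 { unsigned x, y, z; };

// Assumed warp size; in device code the built-in warpSize gives the real value.
constexpr unsigned kWarpSize = 32;

// Linear thread ID within a block: x + y * Dx + z * Dx * Dy.
unsigned linear_thread_id(Dim3 idx, Dim3 dim) {
    return idx.x + idx.y * dim.x + idx.z * dim.x * dim.y;
}

// Indices of every thread in the block whose linear ID is a multiple of the
// warp size, i.e. the threads the `id % 32 == 0` test would select.
std::vector<Dim3> warp_leaders(Dim3 dim) {
    assert(dim.x > 0 && dim.y > 0 && dim.z > 0);
    std::vector<Dim3> leaders;
    for (unsigned z = 0; z < dim.z; ++z) {
        for (unsigned y = 0; y < dim.y; ++y) {
            for (unsigned x = 0; x < dim.x; ++x) {
                Dim3 idx{x, y, z};
                if (linear_thread_id(idx, dim) % kWarpSize == 0) {
                    leaders.push_back(idx);
                }
            }
        }
    }
    return leaders;
}
```

Calling `warp_leaders(Dim3{64, 1, 1})` yields two threads, and `warp_leaders(Dim3{32, 32, 1})` yields 32 threads, all with `x == 0`. With a block width of exactly 32, each row of the block is one warp, so for that shape `threadIdx.x == 0` selects the same threads. For widths that are not a multiple of 32, a warp spans several rows (with (16, 16, 1), thread ID 32 is the thread at index (0, 2)), so the linear ID is the safer thing to test.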

Upvotes: 4
