Reputation: 728
I want to select only one thread per warp for a few operations.
For example, let's start with a 1-D block dim of (64, 1, 1). As I understand it, this results in two warps, given a warp size of 32. In this case, I can use the following code to select one thread per warp:
if(threadIdx.x % 32 == 0) { ... }
First of all, does this make sense? I am not sure whether we know how threads are mapped to warps on the hardware.
Secondly, how can this be achieved for a 2-D block dim of (32, 32, 1)? Presumably a simple % 32 won't work here, since thread indexing in two dimensions is different?
Thanks.
Upvotes: 2
Views: 596
Reputation: 593
Threads within a block are mapped to warps in a consistent way.
In section 4.1 of the CUDA programming guide:
The way a block is partitioned into warps is always the same; each warp contains threads of consecutive, increasing thread IDs with the first warp containing thread 0.
In section 2.2.1 of the CUDA programming guide:
The index of a thread and its thread ID relate to each other in a straightforward way: For a one-dimensional block, they are the same; for a two-dimensional block of size (Dx, Dy), the thread ID of a thread of index (x, y) is (x + y Dx); for a three-dimensional block of size (Dx, Dy, Dz), the thread ID of a thread of index (x, y, z) is (x + y Dx + z Dx Dy).
If you want to select a single thread per warp in your kernel, you can compute the linear thread ID and test it against the warp size:
int id = threadIdx.x + threadIdx.y * blockDim.x + threadIdx.z * blockDim.x * blockDim.y;
if (id % 32 == 0) {
    /* First thread of each warp is selected. */
}
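To see what this selects for the block shapes in the question, here is a minimal host-side sketch in plain C++ that replays the same linear ID formula on the CPU. It is only an illustration, not device code: `Dim3`, `kWarpSize`, `linear_thread_id`, `is_warp_leader`, and `count_warp_leaders` are hypothetical names made up for this example, and the warp size of 32 is assumed (in a kernel you could read the built-in `warpSize` instead).

```cpp
#include <cassert>

// Host-side stand-in for CUDA's dim3/uint3 (hypothetical, for illustration only).
struct Dim3 { unsigned x, y, z; };

// Assumed warp size; device code can read the built-in warpSize instead.
constexpr unsigned kWarpSize = 32;

// Linear thread ID as defined in the programming guide: x + y*Dx + z*Dx*Dy.
constexpr unsigned linear_thread_id(Dim3 idx, Dim3 dim) {
    return idx.x + idx.y * dim.x + idx.z * dim.x * dim.y;
}

// True for the first thread of each warp, i.e. the same test as id % 32 == 0.
constexpr bool is_warp_leader(Dim3 idx, Dim3 dim) {
    return linear_thread_id(idx, dim) % kWarpSize == 0;
}

// Walks every thread index in a block and counts how many pass the leader test.
unsigned count_warp_leaders(Dim3 dim) {
    assert(dim.x > 0 && dim.y > 0 && dim.z > 0);
    unsigned leaders = 0;
    for (unsigned z = 0; z < dim.z; ++z)
        for (unsigned y = 0; y < dim.y; ++y)
            for (unsigned x = 0; x < dim.x; ++x)
                if (is_warp_leader(Dim3{x, y, z}, dim)) ++leaders;
    return leaders;
}
```

For a (64, 1, 1) block this selects the threads with linear IDs 0 and 32, one per warp. For a (32, 32, 1) block each row of 32 threads forms one warp, so the selected threads are exactly those with threadIdx.x == 0, one for each value of threadIdx.y. When blockDim.x is not a multiple of 32, for example (16, 16, 1), a warp spans more than one row, which is why the test should use the linear ID rather than threadIdx.x alone.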
Upvotes: 4