Hieu Pham

Reputation: 97

CUDA gridDim, blockDim and threadIdx

This is a conceptual question. In CUDA, gridDim, blockDim and threadIdx can be 1D, 2D or 3D. I wonder, how are their 2D and 3D versions interpreted?

In more detail, does CUDA think of multi-dimensional gridDim, blockDim and threadIdx just as a linear sequence, in the same way that C stores multi-dimensional arrays? If not, how should we interpret multi-dimensional gridDim, blockDim and threadIdx?

Thanks.

Edit 1. This question is not a duplicate. I have actually come across the linked question. It asks about the execution order of GPU threads, not their layout, which is what this one asks about.

Edit 2. Also, the answer to this question can be found at http://docs.nvidia.com/cuda/cuda-c-programming-guide/#thread-hierarchy. Thank you, @talonmies, for the reference. To sum it up, multi-dimensional gridDim, blockDim and threadIdx are for convenience. They can be interpreted just like a column-major ordered multi-dimensional array.

Upvotes: 0

Views: 2912

Answers (1)

talonmies

Reputation: 72372

Quoting directly from the CUDA programming guide:

The index of a thread and its thread ID relate to each other in a straightforward way: For a one-dimensional block, they are the same; for a two-dimensional block of size (Dx, Dy), the thread ID of a thread of index (x, y) is (x + y Dx); for a three-dimensional block of size (Dx, Dy, Dz), the thread ID of a thread of index (x, y, z) is (x + y Dx + z Dx Dy).

So, yes, the logical thread numbering in the programming model is sequential, with the x dimension varying fastest, then the y dimension, then the z dimension. This applies both to thread numbering within blocks and block numbering within a grid. The numbering is analogous to column-major ordered multi-dimensional arrays, although the actual threadIdx and blockIdx variables themselves are just structures reflecting internal thread and block identification words assigned by the scheduler to each thread or block.

You should note that the numbering implied by threadIdx and blockIdx is just for programmer convenience and doesn't imply anything about the execution order of threads on the GPU.

Upvotes: 2
