Reputation: 7128
I can't seem to understand the wording for the CUDA kernel launch parameters <<<gridSize, blockSize>>>.
In the code I am reviewing they are defined as
const dim3 blockSize(1, 1, 1);
const dim3 gridSize( 1, 1, 1);
If I replaced the hardcoded 1s with variables, would they be properly named like so:
const dim3 blockSize(nThreadsX, nThreadsY, nThreadsZ);
const dim3 gridSize(nBlocksX, nBlocksY, nBlocksZ);
where the maximum value of any argument to blockSize is set by the hardware (something like 512 or 1024?) and is the maximum number of threads that will run in a block in a single dimension?
Upvotes: 1
Views: 135
Reputation: 151869
Yes, the proposed naming is sensible. Those dim3 parameters are intended to represent (x,y,z) dimensions. The block is composed of threads. The grid is composed of blocks.
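To make the thread/block/grid hierarchy concrete, here is a minimal sketch using the names from the question. The kernel name whoAmI and the specific sizes (4x2x1 threads, 2x1x1 blocks) are made up for illustration, and device-side printf assumes a GPU of compute capability 2.0 or later.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread reports which block it belongs to
// and where it sits inside that block.
__global__ void whoAmI()
{
    printf("block (%u,%u,%u) thread (%u,%u,%u)\n",
           blockIdx.x, blockIdx.y, blockIdx.z,
           threadIdx.x, threadIdx.y, threadIdx.z);
}

int main()
{
    // Illustrative sizes only (not taken from the code under review).
    const unsigned nThreadsX = 4, nThreadsY = 2, nThreadsZ = 1; // threads per block
    const unsigned nBlocksX  = 2, nBlocksY  = 1, nBlocksZ  = 1; // blocks per grid

    const dim3 blockSize(nThreadsX, nThreadsY, nThreadsZ);
    const dim3 gridSize(nBlocksX, nBlocksY, nBlocksZ);

    // Grid dimensions come first, block dimensions second.
    whoAmI<<<gridSize, blockSize>>>();

    // An invalid launch configuration is reported here.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        // Wait for the kernel so its printf output is flushed.
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

With these sizes the launch creates 2 blocks of 8 threads each, so 16 lines are printed; the order of the lines is not guaranteed.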
Using your naming, nBlocksX, nBlocksY and nBlocksZ must each be no greater than the corresponding hardware-defined limit. Those limits can be discovered from the programming guide (Table 12) or programmatically using a method such as that contained in the deviceQuery sample app.
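For the programmatic route, something along these lines should work. It is only a sketch in the spirit of deviceQuery, not the sample's actual code, and it assumes you want device 0.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    const int device = 0; // assumption: query the first GPU in the system

    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    // Per-dimension limits for the grid (nBlocksX, nBlocksY, nBlocksZ).
    printf("max grid size:         %d x %d x %d\n",
           prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);

    // Per-dimension limits for a block (nThreadsX, nThreadsY, nThreadsZ).
    printf("max block dimensions:  %d x %d x %d\n",
           prop.maxThreadsDim[0], prop.maxThreadsDim[1], prop.maxThreadsDim[2]);

    // Limit on the total number of threads in one block.
    printf("max threads per block: %d\n", prop.maxThreadsPerBlock);

    return 0;
}
```

The printed values depend on the GPU, so check them on the hardware you actually target rather than hardcoding them.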
There are similar limits for nThreadsX, nThreadsY and nThreadsZ, but in addition, the product nThreadsX * nThreadsY * nThreadsZ must also satisfy another limit: the maximum number of threads per block, which is either 512 or 1024 for current CUDA GPU hardware.
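If you want to check a configuration before launching, a small helper like the one below is one way to do it. The function name fitsDeviceLimits is hypothetical (it is not part of the CUDA API); it only compares the dim3 values against the fields of a cudaDeviceProp that you have already filled in with cudaGetDeviceProperties.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper, not a CUDA API function: returns true if the block and
// grid dimensions respect the per-dimension limits and the per-block thread
// limit reported in prop.
bool fitsDeviceLimits(const dim3 &blockSize, const dim3 &gridSize,
                      const cudaDeviceProp &prop)
{
    // Every dimension must be at least 1.
    if (blockSize.x == 0 || blockSize.y == 0 || blockSize.z == 0 ||
        gridSize.x == 0 || gridSize.y == 0 || gridSize.z == 0) {
        return false;
    }

    // Per-dimension limits on nThreadsX, nThreadsY, nThreadsZ.
    const bool blockDimsOk =
        blockSize.x <= static_cast<unsigned>(prop.maxThreadsDim[0]) &&
        blockSize.y <= static_cast<unsigned>(prop.maxThreadsDim[1]) &&
        blockSize.z <= static_cast<unsigned>(prop.maxThreadsDim[2]);

    // Limit on the product nThreadsX * nThreadsY * nThreadsZ.
    // A 64-bit product avoids overflow for large inputs.
    const unsigned long long threadsPerBlock =
        static_cast<unsigned long long>(blockSize.x) * blockSize.y * blockSize.z;
    const bool blockTotalOk =
        threadsPerBlock <= static_cast<unsigned long long>(prop.maxThreadsPerBlock);

    // Per-dimension limits on nBlocksX, nBlocksY, nBlocksZ.
    const bool gridDimsOk =
        gridSize.x <= static_cast<unsigned>(prop.maxGridSize[0]) &&
        gridSize.y <= static_cast<unsigned>(prop.maxGridSize[1]) &&
        gridSize.z <= static_cast<unsigned>(prop.maxGridSize[2]);

    return blockDimsOk && blockTotalOk && gridDimsOk;
}
```

For example, on a device that reports a per-block limit of 1024 threads, a block of (32, 32, 1) passes, while (32, 32, 2) fails on the product even though each individual dimension is within its own limit.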
Upvotes: 2