Reputation: 3520
According to http://en.wikipedia.org/wiki/CUDA, the maximum x- or y-dimension of a block is 1024 and the maximum z-dimension of a block is 64.
Does this mean a block can have 1024 x 1024 x 64 threads, or that it can have at most 1024 + 64 threads?
Upvotes: 2
Views: 1553
Reputation: 72372
The limits are defined in Appendix G of recent programming guides, but the answer is either 512 or 1024 threads per block in total, depending on whether you have a Fermi card or an older one.
So for Fermi:
blockDim.x * blockDim.y * blockDim.z <= 1024
and for GT200/G90/G80/Ion:
blockDim.x * blockDim.y * blockDim.z <= 512
Note that there are other resource limits (shared memory and registers) which may require block sizes to be smaller than these limits, depending on code complexity. This is also discussed at some length in the programming guide.
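To make the distinction concrete, here is a minimal host-side sketch of the check, written as plain C++ so it compiles without a GPU. The struct `BlockLimits`, the constants `kFermi` and `kPreFermi`, and the function `blockShapeIsValid` are hypothetical names invented for this illustration. The pre-Fermi per-dimension caps (512, 512, 64) are an assumption not stated in this thread. In a real CUDA program the same numbers would normally come from the `maxThreadsDim` and `maxThreadsPerBlock` fields of `cudaDeviceProp` rather than being hard-coded. The sketch does not model the shared memory or register limits mentioned above.

```cpp
#include <cassert>

// Hypothetical stand-in for the limits a CUDA program would read from
// cudaDeviceProp (maxThreadsDim[3] and maxThreadsPerBlock).
struct BlockLimits {
    int maxDim[3];           // per-dimension caps: x, y, z
    int maxThreadsPerBlock;  // cap on the product x * y * z
};

// Fermi values as quoted in this thread.
constexpr BlockLimits kFermi = {{1024, 1024, 64}, 1024};
// Assumed values for pre-Fermi (GT200/G80 class) hardware.
constexpr BlockLimits kPreFermi = {{512, 512, 64}, 512};

// A block shape is valid only if every dimension is within its own cap
// AND the product of the three dimensions is within the per-block total.
constexpr bool blockShapeIsValid(const BlockLimits& lim, int x, int y, int z) {
    return x >= 1 && y >= 1 && z >= 1 &&
           x <= lim.maxDim[0] && y <= lim.maxDim[1] && z <= lim.maxDim[2] &&
           1LL * x * y * z <= lim.maxThreadsPerBlock;
}

// Worked examples for the shapes discussed in the question.
void checkExamples() {
    assert(!blockShapeIsValid(kFermi, 1024, 1024, 64)); // product far above 1024
    assert(blockShapeIsValid(kFermi, 32, 32, 1));       // 32 * 32 * 1 = 1024
    assert(blockShapeIsValid(kFermi, 16, 16, 4));       // 16 * 16 * 4 = 1024
    assert(!blockShapeIsValid(kFermi, 1, 1, 128));      // z exceeds its cap of 64
    assert(!blockShapeIsValid(kPreFermi, 32, 32, 1));   // 1024 > 512
    assert(blockShapeIsValid(kPreFermi, 16, 16, 2));    // 16 * 16 * 2 = 512
}
```

The per-dimension figures from the Wikipedia table and the per-block total are separate constraints, and a launch has to satisfy both. That is why 1024 x 1024 x 64 is rejected even though each dimension is individually within range.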
Upvotes: 5