pr0py_s

Reputation: 171

Number of grids, blocks and threads in a GPU

I am new to CUDA and GPU architecture. I ran this code.

The output only showed the dimensions of a grid. I understand that a grid has three dimensions, which give the number of blocks in x, y and z, and that each block in turn contains x*y*z threads.

My question is: how many grids are there in a GPU, or is that independent of the GPU? How do I find that out, and how do I handle cases where a large number of threads is required?

Upvotes: 4

Views: 3179

Answers (2)

Michael Kenzel

Reputation: 15933

A grid effectively represents a kernel launch, i.e., it contains all the blocks (and, thus, threads) that are to be run for one particular kernel launch. There are certain restrictions concerning the dimensions of blocks and grids, which are mainly architecture specific (in general, they are largely the same for all GPU models of the same generation). You can find a detailed list of all the device-specific limits and capabilities in the CUDA programming guide.
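If you would rather read the limits off your own device than look them up, the runtime API exposes them. The following is only a minimal sketch: `queryLimits` is a hypothetical helper name introduced here for illustration, while `cudaGetDeviceCount`, `cudaGetDeviceProperties` and the `cudaDeviceProp` fields are part of the CUDA runtime.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): fills `prop` for device `dev`
// and returns false if the runtime reports an error.
bool queryLimits(int dev, cudaDeviceProp* prop) {
    return cudaGetDeviceProperties(prop, dev) == cudaSuccess;
}

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        if (!queryLimits(dev, &prop)) continue;

        std::printf("Device %d: %s (compute capability %d.%d)\n",
                    dev, prop.name, prop.major, prop.minor);
        // Upper bound on the total number of threads in one block.
        std::printf("  max threads per block: %d\n", prop.maxThreadsPerBlock);
        // Per-axis upper bounds on the block dimensions.
        std::printf("  max block dimensions:  %d x %d x %d\n",
                    prop.maxThreadsDim[0], prop.maxThreadsDim[1], prop.maxThreadsDim[2]);
        // Per-axis upper bounds on the grid dimensions (in blocks).
        std::printf("  max grid dimensions:   %d x %d x %d\n",
                    prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);
        std::printf("  multiprocessors:       %d\n", prop.multiProcessorCount);
    }
    return 0;
}
```

Note that these are limits on a single block and a single grid, not a count of grids; every kernel launch creates its own grid.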

How you choose your block dimensions will mainly be motivated by a tradeoff among three things: getting good memory access patterns (divide the work in a way that aims for coalesced global memory access), deciding which groups of threads need to communicate through shared memory, and achieving a desired occupancy.
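For the occupancy part of that tradeoff, the runtime can suggest a starting block size for a given kernel. Here is a rough sketch, assuming a trivial kernel `scale` and a wrapper `suggestedBlockSize` that are both made up for this example; only `cudaOccupancyMaxPotentialBlockSize` is a real runtime call. The suggestion considers occupancy alone, so memory access patterns and shared memory use may still call for a different size.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel (hypothetical): multiplies each element by `factor`.
__global__ void scale(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical wrapper: returns the block size the runtime suggests for
// `scale`, or -1 if the query fails (for example, when no device is present).
int suggestedBlockSize() {
    int minGridSize = 0;  // smallest grid that can reach full occupancy
    int blockSize = 0;    // suggested number of threads per block
    cudaError_t err = cudaOccupancyMaxPotentialBlockSize(
        &minGridSize, &blockSize, scale,
        0,   // dynamic shared memory per block, in bytes
        0);  // 0 means no upper limit on the block size
    return err == cudaSuccess ? blockSize : -1;
}

int main() {
    int blockSize = suggestedBlockSize();
    if (blockSize < 0) {
        std::printf("occupancy query failed\n");
        return 1;
    }
    std::printf("suggested threads per block: %d\n", blockSize);
    return 0;
}
```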

While the maximum block and possibly also grid sizes (though the latter should hardly be an issue) will, thus, affect the way you write and run your CUDA kernels, the maximum number of grids will normally only ever be of concern when you use Dynamic Parallelism. If your kernels do not fully occupy the GPU, the driver may also overlap kernel execution where possible; however, you don't really have explicit control over that.

Upvotes: 3

pr0p

Reputation: 2358

A grid in CUDA is like a workspace. In the output of your query code, the block dimensions (e.g. dimension 0 of the block) give the maximum number of threads along each axis of a block. On modern GPUs that is 1024 x 1024 x 64 in most cases, but the total number of threads in a block is capped as well (at 1024 on current hardware), so you cannot use all three maxima at once. The grid dimensions show the maximum number of blocks in x, y and z. You select the number of blocks and the number of threads per block you need when launching a __global__ function.
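As for needing a large number of threads: one common pattern is to cover the data with enough blocks, or to cap the number of blocks and let each thread process several elements in a grid-stride loop. The sketch below is one way to do that, not the only one. The kernel `addOne`, the helper `blocksFor` and the cap of 1024 blocks are all assumptions chosen for this example.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: number of blocks needed to cover n elements
// (ceiling division). Assumes n + threadsPerBlock fits in an int.
int blocksFor(int n, int threadsPerBlock) {
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Grid-stride loop: each thread handles elements i, i + stride, i + 2*stride, ...
// so the kernel stays correct even when the grid has fewer threads than elements.
__global__ void addOne(int* data, int n) {
    int stride = gridDim.x * blockDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        data[i] += 1;
    }
}

int main() {
    const int n = 1 << 20;
    const int threadsPerBlock = 256;
    const int maxBlocks = 1024;  // arbitrary cap, chosen only for illustration

    int blocks = blocksFor(n, threadsPerBlock);
    if (blocks > maxBlocks) blocks = maxBlocks;

    int* d = nullptr;
    if (cudaMalloc(&d, n * sizeof(int)) != cudaSuccess) {
        std::printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d, 0, n * sizeof(int));

    // <<<number of blocks in the grid, number of threads per block>>>
    addOne<<<blocks, threadsPerBlock>>>(d, n);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(d);
        return 1;
    }

    // Copy the result back and check that every element was incremented once.
    std::vector<int> h(n);
    cudaMemcpy(h.data(), d, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);

    bool ok = true;
    for (int v : h) ok = ok && (v == 1);
    std::printf("%s\n", ok ? "all elements incremented" : "mismatch");
    return ok ? 0 : 1;
}
```

With this pattern the launch configuration becomes a tuning knob rather than a correctness requirement, since the loop covers all n elements for any grid size of at least one block.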

Upvotes: 1
