user8462

Reputation: 11

Maximum number of threads for a kernel

I am new to CUDA programming. I am working on a Kepler GPU which has:

3.2 compute_capability
1024 max_threads_per_block
1 multiprocessor
2048 max_threads_per_multiprocessor
2147483647 grid size

Does this mean that I can only assign 2048 threads to a kernel? If so, what is that large grid size for?

My application involves a large number of matrix calculations.

Upvotes: 1

Views: 3557

Answers (1)

Robert Crovella

Reputation: 151879

You'll need to learn more about CUDA programming.

You can have more than 1024 or 2048 threads in a kernel (i.e. a grid).

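The limit of 1024 is the per-block limit. The 2048 number is the maximum number of threads that can be resident on one multiprocessor at the same time; it does not limit how many threads a kernel can launch, and it is something you don't need to focus on too much if you are a beginner.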
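If it helps to see where these numbers come from, here is a minimal sketch that reads them from the runtime with `cudaGetDeviceProperties`. It assumes device 0 is the GPU in question; the helper `maxThreads1D` is a hypothetical name introduced only for this illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: upper bound on threads in a one-dimensional launch,
// i.e. (max blocks in x) * (max threads per block).
static long long maxThreads1D(int maxGridX, int maxThreadsPerBlock)
{
    return (long long)maxGridX * maxThreadsPerBlock;
}

int main()
{
    cudaDeviceProp p;
    // Assumption: device 0 is the GPU being asked about.
    if (cudaGetDeviceProperties(&p, 0) != cudaSuccess) {
        fprintf(stderr, "could not query device 0\n");
        return 1;
    }
    printf("compute capability:              %d.%d\n", p.major, p.minor);
    printf("multiprocessors:                 %d\n", p.multiProcessorCount);
    // Per-block launch limit (the B in <<<A,B>>>).
    printf("max threads per block:           %d\n", p.maxThreadsPerBlock);
    // Residency limit per multiprocessor; not a launch limit.
    printf("max resident threads per SM:     %d\n", p.maxThreadsPerMultiProcessor);
    // Launch limit on blocks in the x dimension (the A in <<<A,B>>>).
    printf("max grid size (x):               %d\n", p.maxGridSize[0]);
    printf("max threads in a 1-D launch:     %lld\n",
           maxThreads1D(p.maxGridSize[0], p.maxThreadsPerBlock));
    return 0;
}
```

Only `maxThreadsPerBlock` and `maxGridSize` constrain the launch configuration; `maxThreadsPerMultiProcessor` describes how many threads the hardware keeps in flight at once.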

In the kernel launch:

mykernel<<<A,B>>>(...);

The B parameter is the threads per block. It is limited to 1024.

The A parameter is blocks per grid. It is limited to 2^31-1 (for the x dimension on a Kepler GPU). So you could in theory launch (2^31-1)*1024 threads in a one-dimensional grid, on a cc3.x device.
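As a rough sketch of what that looks like in practice, the program below launches far more than 2048 threads by choosing the number of blocks from the problem size. The kernel `scale`, the helper `blocksFor`, the 4096 x 4096 matrix size, and the choice of 256 threads per block are all assumptions made for illustration, not something from your code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread scales one element of a flattened matrix.
__global__ void scale(float *data, float factor, size_t n)
{
    // Global index across the whole grid, not just within one block.
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)  // the last block may be only partly used
        data[i] *= factor;
}

// Smallest number of blocks such that blocks * threadsPerBlock >= n.
static unsigned int blocksFor(size_t n, unsigned int threadsPerBlock)
{
    return (unsigned int)((n + threadsPerBlock - 1) / threadsPerBlock);
}

int main()
{
    const size_t n = (size_t)4096 * 4096;    // assumed matrix size
    const unsigned int B = 256;              // threads per block, must be <= 1024
    const unsigned int A = blocksFor(n, B);  // blocks per grid

    float *d = nullptr;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d, 0, n * sizeof(float));

    scale<<<A, B>>>(d, 2.0f, n);
    cudaError_t err = cudaGetLastError();    // catches invalid launch configurations
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();       // catches errors during execution

    printf("launched %u blocks of %u threads: %s\n", A, B, cudaGetErrorString(err));
    cudaFree(d);
    return err == cudaSuccess ? 0 : 1;
}
```

The GPU runs the blocks in batches as multiprocessor resources free up, so a single multiprocessor can still work through a grid of this size; it just processes a limited number of blocks at a time.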

Upvotes: 3
