Reputation: 11
I am new to CUDA programming. I am working on a Kepler GPU which has:
3.2 compute_capability
1024 max_threads_per_block
1 Multiprocessor
2048 max_threads_per_multiprocessor
2147483647 grid size
Does this mean that I can only assign 2048 threads to a kernel? If so, what is that large grid size for?
My application involves a large number of matrix calculations.
Upvotes: 1
Views: 3557
Reputation: 151879
You'll need to learn more about CUDA programming.
You can have more than 1024 or 2048 threads in a kernel (i.e. a grid).
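The limit of 1024 is the per-block limit. The 2048 number is the maximum number of threads that can be resident on one multiprocessor at the same time; it does not limit the size of a kernel launch, and it is something you don't need to focus on too much if you are a beginner.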
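If it helps to see where those numbers come from, they can be read at runtime with `cudaGetDeviceProperties`. The following is only a small sketch; the helper name `print_launch_limits` is made up for illustration, and it assumes the device index you pass in exists.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): prints the launch-related limits
// of one device. Returns false if the query fails, e.g. no such device.
bool print_launch_limits(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess)
        return false;

    std::printf("compute capability:    %d.%d\n", prop.major, prop.minor);
    std::printf("max threads per block: %d\n", prop.maxThreadsPerBlock);
    std::printf("multiprocessors:       %d\n", prop.multiProcessorCount);
    std::printf("max threads per SM:    %d\n", prop.maxThreadsPerMultiProcessor);
    std::printf("max grid size (x):     %d\n", prop.maxGridSize[0]);
    return true;
}
```

Calling `print_launch_limits(0)` queries the first device. Only the per-block and grid-size values constrain what you can put in a launch; the per-multiprocessor value describes how many threads can be resident at once.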
In the kernel launch:
mykernel<<<A,B>>>(...);
The B parameter is the threads per block. It is limited to 1024.
The A parameter is blocks per grid. It is limited to 2^31-1 (for the x dimension on a Kepler GPU). So you could in theory launch (2^31-1)*1024 threads in a one-dimensional grid, on a cc3.x device.
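To make the A and B parameters concrete, here is a minimal sketch of a launch that uses far more than 2048 threads. The kernel `scale`, the helpers `blocks_for` and `scale_on_gpu`, and the block size of 256 are all assumptions chosen for illustration, not anything from the question; the idea is just that B stays at or below 1024 and A is computed from the problem size.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Illustrative kernel: one thread per element. The bounds check is needed
// because A * B is usually a little larger than n.
__global__ void scale(float *data, size_t n, float factor)
{
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

// Hypothetical helper: smallest block count A such that A * B >= n.
unsigned int blocks_for(size_t n, unsigned int threads_per_block)
{
    return (unsigned int)((n + threads_per_block - 1) / threads_per_block);
}

// Hypothetical helper: copies the vector to the GPU, scales it in place,
// and copies it back. Returns false if any CUDA call reports an error.
bool scale_on_gpu(std::vector<float> &host, float factor)
{
    const size_t n = host.size();
    if (n == 0)
        return true;
    const size_t bytes = n * sizeof(float);

    float *dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess)
        return false;

    bool ok = cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const unsigned int B = 256;               // threads per block (<= 1024)
        const unsigned int A = blocks_for(n, B);  // blocks per grid
        scale<<<A, B>>>(dev, n, factor);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev);
    return ok;
}
```

For example, with 1,000,000 elements and B = 256, `blocks_for` gives A = 3907, so the launch creates 3907 * 256 = 1,000,192 threads. Not all of them run at the same time; the hardware assigns blocks to the multiprocessor as earlier blocks finish.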
Upvotes: 3