Reputation: 149
I am writing a CUDA program that will likely run on many different GPUs. I would like to know whether CUDA provides a way to query, from code (either at runtime or at compile time), the capabilities of the current GPU, such as the maximum number of threads a single block can contain and the maximum number of blocks, so I can tailor the kernel launch to make optimal use of all the available resources.
I know it may sound like a silly question, but I can't find any answers online.
Bonus question, if this is not possible: I see here that someone says the Jetson TX1 has 2 SMs, each with 128 cores. I have read that per SM (of which I understand there are 2) there can be a maximum of 16 active blocks and 64 active warps (i.e., 2048 active threads).
How can I find this info for a given GPU?
Upvotes: 1
Views: 621
Reputation: 7157
cudaGetDeviceProperties seems to be what you are looking for.
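A minimal sketch of how you might use it, assuming you just want to print the limits for every visible device (the fields read below, such as maxThreadsPerBlock, maxThreadsPerMultiProcessor and multiProcessorCount, are members of cudaDeviceProp; compile with nvcc):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int deviceCount = 0;
    if (cudaGetDeviceCount(&deviceCount) != cudaSuccess || deviceCount == 0) {
        fprintf(stderr, "No CUDA-capable device found\n");
        return 1;
    }

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);

        printf("Device %d: %s (compute capability %d.%d)\n",
               dev, prop.name, prop.major, prop.minor);
        printf("  Multiprocessors (SMs):  %d\n", prop.multiProcessorCount);
        printf("  Max threads per block:  %d\n", prop.maxThreadsPerBlock);
        printf("  Max threads per SM:     %d\n", prop.maxThreadsPerMultiProcessor);
        printf("  Warp size:              %d\n", prop.warpSize);
        printf("  Max block dims:         %d x %d x %d\n",
               prop.maxThreadsDim[0], prop.maxThreadsDim[1], prop.maxThreadsDim[2]);
        printf("  Max grid dims:          %d x %d x %d\n",
               prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);
    }
    return 0;
}
```

For the bonus question: multiProcessorCount gives the number of SMs and maxThreadsPerMultiProcessor the active-thread limit per SM; if I recall correctly, newer toolkits (CUDA 11+) also expose maxBlocksPerMultiProcessor for the active-block limit. At compile time, device code can additionally branch on the __CUDA_ARCH__ macro, but for tailoring a launch to whatever GPU the program ends up running on, the runtime query above is usually the way to go.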
Upvotes: 4