Reputation: 796
How do I programmatically find the maximum number of concurrent CUDA threads, or the number of streaming multiprocessors, on a device (an NVIDIA graphics card)? I know about warpSize, but there is no warpCount.
Most answers on the internet only point to looking the values up in PDFs.
Upvotes: 0
Views: 591
Reputation: 9474
This depends not only on the device but also on your code, e.g. on the number of registers each thread uses or the amount of shared memory each block needs. I would suggest reading about occupancy.
Another thing I would note: if your code relies on a certain number of threads being resident on the device at the same time (e.g. if you wait for several threads to reach some execution point), you are bound to run into race conditions and see your code hang.
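To make the occupancy point concrete, here is a minimal sketch that asks the runtime how many blocks of a given kernel can be resident on one multiprocessor. The kernel `scaleKernel`, the helper `residentThreads`, and the block size of 256 are hypothetical placeholders, not something from the question; only the `cudaOccupancyMaxActiveBlocksPerMultiprocessor` and `cudaGetDeviceProperties` calls come from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical placeholder kernel; substitute the kernel you actually launch.
__global__ void scaleKernel(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: threads of this kernel that can be resident at once.
static int residentThreads(int blocksPerSM, int blockSize, int smCount) {
    return blocksPerSM * blockSize * smCount;
}

int main() {
    const int device = 0;               // assumption: query the first device
    const int blockSize = 256;          // assumption: the launch block size
    const size_t dynamicSmemBytes = 0;  // assumption: no dynamic shared memory

    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // The result accounts for the kernel's register and shared memory usage.
    int blocksPerSM = 0;
    err = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &blocksPerSM, scaleKernel, blockSize, dynamicSmemBytes);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "occupancy query: %s\n", cudaGetErrorString(err));
        return 1;
    }

    std::printf("Resident blocks per SM: %d\n", blocksPerSM);
    std::printf("Resident threads on device: %d\n",
                residentThreads(blocksPerSM, blockSize, prop.multiProcessorCount));
    return 0;
}
```

The number this prints is specific to one kernel and one block size, so it would need to be queried again for each kernel you care about.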
Upvotes: 1
Reputation: 2246
Have you tried checking the SDK samples? I think the sample you want is Device Query.
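For reference, a minimal sketch of the kind of query that sample performs, using `cudaGetDeviceProperties` from the CUDA runtime API. The helper `maxResidentThreads` is a hypothetical name introduced here for illustration, and the value it returns is only a hardware upper bound; a real kernel may reach fewer resident threads.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: hardware upper bound on resident threads for a device.
static int maxResidentThreads(int smCount, int maxThreadsPerSM) {
    return smCount * maxThreadsPerSM;
}

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "device %d: %s\n", dev, cudaGetErrorString(err));
            continue;
        }

        // There is no warpCount field, but it can be derived from these two.
        int warpsPerSM = prop.maxThreadsPerMultiProcessor / prop.warpSize;

        std::printf("Device %d: %s\n", dev, prop.name);
        std::printf("  Streaming multiprocessors: %d\n", prop.multiProcessorCount);
        std::printf("  Max threads per SM:        %d\n", prop.maxThreadsPerMultiProcessor);
        std::printf("  Max threads per block:     %d\n", prop.maxThreadsPerBlock);
        std::printf("  Warp size:                 %d\n", prop.warpSize);
        std::printf("  Max resident warps per SM: %d\n", warpsPerSM);
        std::printf("  Max resident threads:      %d\n",
                    maxResidentThreads(prop.multiProcessorCount,
                                       prop.maxThreadsPerMultiProcessor));
    }
    return 0;
}
```

Compiled with nvcc, this prints the values at run time instead of requiring a lookup in the documentation.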
Upvotes: 2