Reputation: 569
I have a GTS 450, which has 4 SMs with 48 cores on each SM, i.e. 192 CUDA cores.
Assume I'm using only a small amount of registers and shared memory.
With compute capability 2.1, what would be the optimal number of blocks and threads per block to achieve 100% occupancy?
Upvotes: 0
Views: 140
Reputation: 1746
If you haven't already, download the appropriate CUDA Toolkit (v5.0) and look for the CUDA Occupancy Calculator Excel spreadsheet that comes with it. Once you enter the initial properties and parameters, it tells you how to achieve 100%, or whatever occupancy you want. One of those inputs is the compute capability, which I'd say is also the most important.
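For a rough idea of what the spreadsheet computes, here is a minimal sketch that looks only at the thread and block limits, assuming (as the question does) that registers and shared memory are not the limiting factor. The per-SM limits are the ones NVIDIA publishes for compute capability 2.x (48 resident warps, 8 resident blocks, 1024 threads per block); the function names are made up for illustration and are not part of any CUDA API.

```python
# Per-SM limits for compute capability 2.x (from NVIDIA's published tables).
WARP_SIZE = 32
MAX_WARPS_PER_SM = 48          # i.e. 1536 resident threads per SM
MAX_BLOCKS_PER_SM = 8
MAX_THREADS_PER_BLOCK = 1024


def theoretical_occupancy(threads_per_block):
    """Occupancy from thread/block limits only.

    Assumption: register and shared memory usage are low enough
    that they never restrict the number of resident blocks.
    """
    if not 1 <= threads_per_block <= MAX_THREADS_PER_BLOCK:
        raise ValueError("block size out of range for compute capability 2.x")
    # A partially filled warp still occupies a full warp slot.
    warps_per_block = -(-threads_per_block // WARP_SIZE)  # ceiling division
    resident_blocks = min(MAX_BLOCKS_PER_SM,
                          MAX_WARPS_PER_SM // warps_per_block)
    return resident_blocks * warps_per_block / MAX_WARPS_PER_SM


def full_occupancy_block_sizes():
    """Block sizes (multiples of the warp size) that reach 100% occupancy."""
    return [n for n in range(WARP_SIZE, MAX_THREADS_PER_BLOCK + 1, WARP_SIZE)
            if theoretical_occupancy(n) == 1.0]


if __name__ == "__main__":
    print(full_occupancy_block_sizes())  # [192, 256, 384, 512, 768]
```

Under these assumptions, block sizes of 192, 256, 384, 512 or 768 threads can fill all 48 warp slots on an SM, while 128 threads per block tops out at about 67% because of the 8-block limit. To keep every SM full, the grid also needs at least enough blocks to cover all 4 SMs (for example 4 x 6 = 24 blocks at 256 threads each). Real kernels may land lower once register and shared memory usage is taken into account, which is what the spreadsheet handles.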
Upvotes: 1