Reputation: 18870
cudaGetDeviceProperties exposes the compute capability (major.minor), but how do we get the GPU architecture (sm_**) to pass to the compiler when building for a given device?
Upvotes: 3
Views: 7609
Reputation: 1
The simplest way:
If you are using CUDA 7.x, pass nvcc flags like the ones below to stay compatible with several GPU generations:
-arch=sm_30 \
-gencode=arch=compute_20,code=sm_20 \
-gencode=arch=compute_30,code=sm_30 \
-gencode=arch=compute_50,code=sm_50 \
-gencode=arch=compute_52,code=sm_52
If you are using CUDA 8.x, set the flags like this:
-arch=sm_30 \
-gencode=arch=compute_20,code=sm_20 \
-gencode=arch=compute_30,code=sm_30 \
-gencode=arch=compute_50,code=sm_50 \
-gencode=arch=compute_52,code=sm_52 \
-gencode=arch=compute_60,code=sm_60 \
-gencode=arch=compute_61,code=sm_61 \
-gencode=arch=compute_62,code=sm_62
Upvotes: -1
Reputation: 151899
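sm_XY corresponds to a "physical" or "real" architecture.
compute_ZW corresponds to a "virtual" architecture.
Not all sm_XY have a corresponding compute_XY.
For example, there is no compute_21 (virtual) architecture.
For a given device, XY is just the major and minor compute capability that cudaGetDeviceProperties reports, so major 5 and minor 2 means sm_52.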
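As a rough sketch of how the compute capability could be turned into compiler flags: the helper names below (sm_name, compute_name, gencode_flag, print_device_archs) are made up for illustration and are not part of the CUDA API. The only special case handled is the compute_21 one mentioned above, where an sm_21 device is paired with compute_20; treat any other mismatch as something to check against the nvcc documentation for your toolkit version.

```cpp
#include <string>

// Hypothetical helper: "real" architecture name, e.g. (5, 2) -> "sm_52".
std::string sm_name(int major, int minor) {
    return "sm_" + std::to_string(major) + std::to_string(minor);
}

// Hypothetical helper: "virtual" architecture name.
// There is no compute_21, so capability 2.1 falls back to compute_20.
// Assumption: this is the only mismatch handled here.
std::string compute_name(int major, int minor) {
    if (major == 2 && minor == 1) {
        minor = 0;
    }
    return "compute_" + std::to_string(major) + std::to_string(minor);
}

// Hypothetical helper: one -gencode option for a given compute capability.
std::string gencode_flag(int major, int minor) {
    return "-gencode=arch=" + compute_name(major, minor) +
           ",code=" + sm_name(major, minor);
}

#ifdef __CUDACC__  // this part is only compiled when building with nvcc
#include <cstdio>
#include <cuda_runtime.h>

// Print a suggested -gencode flag for every visible device.
// Returns the number of devices, or -1 if a runtime call fails.
int print_device_archs() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) {
        return -1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) {
            return -1;
        }
        std::printf("device %d (%s): %s\n", dev, prop.name,
                    gencode_flag(prop.major, prop.minor).c_str());
    }
    return count;
}
#endif
```

Built with nvcc and called from main, print_device_archs() lists one flag per GPU, which a build script could capture and pass back to nvcc. The string helpers are kept free of CUDA calls so they can be reused or checked on a machine without a GPU.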
Upvotes: 11