irritable_phd_syndrome

Reputation: 5077

CUDA fails to recognize nvcuda namespace during compilation

I am following the CUDA tutorial on using the V100 tensor cores. My minimal working example (MWE):

$ cat src/wmma.cu
#include <cuda_runtime_api.h>
#include <mma.h>
using namespace nvcuda;
int main(void){
    return 0;
} 

Compiling it with CUDA 9.0,

$ nvcc src/wmma.cu
src/wmma.cu(10): error: name must be a namespace name

1 error detected in the compilation of "/gpfs0/scratch/1430008/tmpxft_0002054c_00000000-8_wmma.cpp1.ii".

If I add the option --gpu-architecture=compute_62, I still get the same error. CPATH is set to /opt/cuda/9.0/include:, so I believe the compiler is finding the header files.

When I comment out the using namespace nvcuda line, it compiles and executes as expected.

QUESTION:

  1. Why is my compilation of this trivial code failing?

Upvotes: 4

Views: 3127

Answers (1)

talonmies

Reputation: 72345

Why is my compilation of this trivial code failing?

Because you must specify a compilation architecture that supports these features; otherwise they are undefined:

$ cat nvnvnv.cu 
#include <cuda_runtime_api.h>
#include <mma.h>
using namespace nvcuda;
int main(void){
    return 0;
} 


$ nvcc nvnvnv.cu 
nvnvnv.cu(3): error: name must be a namespace name

1 error detected in the compilation of "/tmp/tmpxft_00005444_00000000-8_nvnvnv.cpp1.ii".

The default compilation architecture is sm_30 on the compiler I am using (CUDA 9.2). Specifying the correct architecture makes the error disappear:

$ nvcc -arch=sm_70 nvnvnv.cu 

$
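If you are unsure which value to pass to -arch, one option is to query the compute capability of the installed GPU through the CUDA runtime API. The following is only a minimal sketch: the file name query_cc.cu and the output format are illustrative choices, not part of the tutorial or of nvcc itself.

```cuda
// query_cc.cu (hypothetical file name): print each GPU's compute capability
// and the matching -arch value. Contains host code only.
#include <cstdio>
#include <cuda_runtime_api.h>

int main(void) {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                         cudaGetErrorString(err));
            return 1;
        }
        // The nvcuda::wmma (tensor core) API requires compute capability 7.0 or higher.
        const char *wmma = (prop.major >= 7) ? "yes" : "no";
        std::printf("device %d: %s, compute capability %d.%d, "
                    "-arch=sm_%d%d, WMMA capable: %s\n",
                    dev, prop.name, prop.major, prop.minor,
                    prop.major, prop.minor, wmma);
    }
    return 0;
}
```

Because this file contains no device code, it should build with a plain nvcc query_cc.cu. A V100 has compute capability 7.0, which corresponds to the -arch=sm_70 flag used above.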

The (very useful) CUDA tag wiki covers exactly this situation:

If you are finding that you are getting syntax errors on CUDA keywords when compiling device code, make sure you are compiling using nvcc and that your source file has the expected .cu extension. If you find that CUDA device functions or feature namespaces you expect to work are not found (atomic functions, warp voting functions, half-precision arithmetic, cooperative groups, etc.), ensure that you are explicitly passing compilation arguments which enable architecture settings which support those features.
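To check which architecture your device code is actually being built for, you can inspect the __CUDA_ARCH__ macro, which nvcc defines only during device compilation passes (for example, 700 for sm_70). The sketch below is illustrative: the kernel name report_arch is hypothetical, and the 700 threshold reflects the requirement that WMMA needs compute capability 7.0 or higher.

```cuda
#include <cstdio>
#include <cuda_runtime_api.h>

// Hypothetical helper kernel: reports the architecture the device code was
// compiled for. __CUDA_ARCH__ is undefined in the host compilation pass.
__global__ void report_arch() {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 700
    printf("device code built for __CUDA_ARCH__ = %d: WMMA is usable\n",
           __CUDA_ARCH__);
#else
    printf("device code built for an architecture below sm_70: "
           "nvcuda::wmma is not available\n");
#endif
}

int main(void) {
    report_arch<<<1, 1>>>();

    // A launch failure (e.g. no kernel image for this GPU) is reported here.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        // Wait for the kernel so its printf output is flushed.
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Compiled without an -arch flag, the kernel takes the second branch on a toolkit whose default is below sm_70; compiled with -arch=sm_70 and run on a Volta-class GPU, it takes the first.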

Upvotes: 8
