How is the CUDA<<<...>>>() kernel launch syntax implemented

Question

CUDA kernels are launched with this syntax (at least in the runtime API)

mykernel<<<...>>>(args);

Is this implemented as a macro or is it special syntax that nvcc removes before handing host code off to gcc?

Robert Crovella · Accepted Answer

The nvcc preprocessing system eventually converts it to a sequence of CUDA runtime library calls before handing the code off to the host code compiler for compilation. The exact sequence of calls may change depending on CUDA version.
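To give a feel for what "a sequence of CUDA runtime library calls" can look like, here is a minimal sketch that launches the same kernel twice: once with the triple-chevron syntax and once through the public runtime API function cudaLaunchKernel. This is not the code nvcc actually generates (that code uses internal calls and differs between CUDA versions). The kernel add_delta and the helper run_both_launches are hypothetical names made up for this illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, used only for illustration: adds delta to each element.
__global__ void add_delta(int *data, int delta, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] += delta;
    }
}

// Launches add_delta once per launch style and returns the number of elements
// that do not hold the expected value, or -1 if a CUDA call fails.
int run_both_launches()
{
    const int n = 256;
    int *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(int)) != cudaSuccess) {
        return -1;
    }
    if (cudaMemset(d_data, 0, n * sizeof(int)) != cudaSuccess) {
        cudaFree(d_data);
        return -1;
    }

    dim3 grid(2), block(128);  // 2 * 128 = 256 threads, one per element
    int delta = 1;
    int count = n;

    // Style 1: the triple-chevron launch that nvcc rewrites in the host code.
    add_delta<<<grid, block>>>(d_data, delta, count);

    // Style 2: an explicit launch through the runtime API. Each entry of
    // args points at the corresponding kernel argument.
    void *args[] = { &d_data, &delta, &count };
    cudaError_t err = cudaLaunchKernel((const void *)add_delta,
                                       grid, block, args,
                                       0 /* dynamic shared memory bytes */,
                                       0 /* default stream */);

    // The blocking copy waits for both launches and reports their errors.
    int h_data[n];
    if (err == cudaSuccess) {
        err = cudaMemcpy(h_data, d_data, sizeof(h_data), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_data);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return -1;
    }

    // Each element started at 0 and should have been incremented by both launches.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (h_data[i] != 2 * delta) {
            ++mismatches;
        }
    }
    return mismatches;
}

int main()
{
    int result = run_both_launches();
    printf("%s\n", result == 0 ? "both launch styles ran" : "launch check failed");
    return result == 0 ? 0 : 1;
}
```

If both launches succeed, every element ends up incremented twice, which shows that the chevron syntax and the explicit API call do the same job from the program's point of view.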

You can inspect the intermediate files by passing the --keep option to nvcc (--verbose may also help with understanding), and you can see a trace of the API calls issued for a kernel launch using one of the profilers, e.g. nvprof --print-api-trace ...

---EDIT---

Just to make this answer more concise, nvcc directly modifies the host code to replace the <<<...>>> syntax before passing it off to the host compiler (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#offline-compilation)
