Why do I need to include <stdio.h> to use CUDA's printf()?

Question

I want to printf() something in my CUDA kernel. The Programming Guide suggests I do that like so:

#include <stdio.h>

__global__ void helloCUDA(float f)
{
    printf("Hello thread %d, f=%f\n", threadIdx.x, f);
}

But this simply includes the standard C library's stdio.h. Why would that be necessary? CUDA's printf() doesn't behave the same way as stdio's printf(), and I certainly don't need most of what else is in there.

talonmies · Accepted Answer

It's an implementation detail that you don't need to know about. It stems from limitations in the CUDA syntax (basically, it is illegal to define different __device__ and __host__ versions of the same function).

The standard library prototype is used as a proxy in device code during compilation, and when compiling for a supported architecture, some sneaky template overloading is used to insert the device implementation into the device code.
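
For illustration, here is a minimal complete sketch of what this looks like from the user's side: a single #include <stdio.h> is enough for printf() to compile in both the kernel and the host code. The launchHello helper, the thread count, and the value passed for f are hypothetical choices for this example, not part of the question. It assumes compilation with nvcc for a device that supports in-kernel printf().

```cuda
#include <stdio.h>          // the one prototype used by both host and device code
#include <cuda_runtime.h>   // spelled out for clarity; nvcc pulls this in for .cu files

// Kernel from the question: printf() here resolves to the device implementation.
__global__ void helloCUDA(float f)
{
    printf("Hello thread %d, f=%f\n", threadIdx.x, f);
}

// Hypothetical helper: launch one block of `threads` threads and wait for it.
// Device-side printf() output is buffered and flushed at synchronization
// points such as cudaDeviceSynchronize().
cudaError_t launchHello(int threads, float f)
{
    helloCUDA<<<1, threads>>>(f);
    cudaError_t err = cudaGetLastError();   // catches launch configuration errors
    if (err != cudaSuccess) {
        return err;
    }
    return cudaDeviceSynchronize();         // waits for the kernel, flushes printf output
}

int main()
{
    cudaError_t err = launchHello(5, 1.2345f);   // arbitrary example values
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    // The same printf name, this time the ordinary host stdio function.
    printf("Host printf: kernel finished\n");
    return 0;
}
```

Built with something like nvcc hello.cu -o hello, this should print one "Hello thread" line per thread followed by the host line. The order of the per-thread lines is not guaranteed.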
