Can I just define CUDA kernels in .h files?

Question

I am having difficulty understanding how I should handle different files in a CUDA program:

I am trying to restructure a CUDA program I have been working on for a while. So far it has been more or less a one-file program: a single .cu file contained all the CUDA code as well as the main function. Several header files were included, but they contained only non-CUDA functions. The program is getting bigger and messier, and I want to split the kernels into different files for readability.

Initially I thought the way to do this was to use .cuh files. I didn't get that to work, so I tried to get my head around this, which suggests a .h file and a .cu file. However, the program would no longer build after I included other .cu files in it. It would typically either not recognize CUDA keywords such as "__global__" or throw seemingly unrelated errors in external includes.

I noticed, however, that it builds when I define the kernel in a .h file. I have the feeling this is not a good idea, but I don't know what the problem with it is. What bothers me is that, from my understanding, .h files should not even be compiled by nvcc, so how does it still work? I have great trouble understanding what the best way to go about this is.

I am using Visual Studio 2012 and CUDA 5.5.

Robert Crovella · Accepted Answer

The rules and behavior here aren't really any different conceptually than what is permissible in C or C++ coding.

For a file that is explicitly included in another file via an #include directive, the file name, and indeed the file extension - .cu, .h, .cuh, .hpp or what have you - really doesn't matter. The directive just tells the compiler to pick up that file and insert it at this point in the source, just as if it had been typed there.

So a statement like "I couldn't get .cuh to work but I could get .h to work" really doesn't make sense. The compiler doesn't care about the filename. Things like .cuh and .h are just naming conventions to help us organize large code bases.

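Files don't get compiled unless they are source modules (e.g. .cu or .c or .cpp, etc.) or are included in one. The compiler doesn't separately compile header files (precompiled headers are another subject, not relevant to this discussion). It only compiles them as part of a source module that includes them. That is why a kernel defined in a .h file still builds: the header's text is compiled by nvcc as part of the .cu file that includes it.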

The danger of defining a function in a header file is that if you include the header file in more than one source module, the function will be defined (i.e. compiled) in more than one source module. Usually you don't want this, as it tends to lead to multiple definition errors at link time.
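As in ordinary C++, templates and inline functions are exempt from this problem, so a kernel template can be defined in a header that several source modules include. The following is only a minimal sketch of that case, not a recommendation from the question or the answer. The file name fill.cuh and the names fill_kernel and filled_on_device are hypothetical, and error checking of the CUDA runtime calls is omitted for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

// ---- fill.cuh (hypothetical header; may be included from several .cu files) ----
#ifndef FILL_CUH
#define FILL_CUH

// A kernel template. Each source module that uses it instantiates its own
// copy, and the linker accepts the duplicates, as with any C++ template.
template <typename T>
__global__ void fill_kernel(T* data, T value, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

// Host-side helper, also a template, so it can live in the header too.
// Assumption: CUDA runtime errors are ignored here to keep the sketch short.
template <typename T>
std::vector<T> filled_on_device(int n, T value)
{
    if (n <= 0) return std::vector<T>();

    std::vector<T> host(n);
    T* d_data = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(T);
    cudaMalloc(&d_data, bytes);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // round up to cover all n elements
    fill_kernel<T><<<blocks, threads>>>(d_data, value, n);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(host.data(), d_data, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    return host;
}

#endif // FILL_CUH
```

Because the header contains a kernel and launch syntax, every file that includes it still has to be compiled by nvcc, i.e. be a .cu file or be marked as CUDA source in the build settings.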

If you intend a header file to be included in one and only one source module, there is no real problem with placing some of your code (i.e. definitions) in that header file. But the typical usage for header files is declarations, not definitions.

A __global__ function for this discussion is really no different than any other C/C++ function. Putting a kernel definition in a header file runs the risk of multiple definition errors, if you include it in more than one source module. If you only include it in one source module, it's fine if that is what you want to do.
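One common way to follow the declaration/definition split is sketched below, shown as a single listing with comments marking where each file would begin. The file names scale.h and scale.cu and the names scale_kernel and scale_on_device are hypothetical, not taken from the question, and error checking is omitted. The kernel is defined exactly once, in the .cu file, and the header declares only a plain host function.

```cuda
#include <vector>
#include <cuda_runtime.h>

// ---- scale.h (hypothetical): declaration only, no CUDA keywords ----
#ifndef SCALE_H
#define SCALE_H
// Multiplies every element of `values` by `factor` on the GPU.
void scale_on_device(std::vector<float>& values, float factor);
#endif // SCALE_H

// ---- scale.cu (hypothetical): would #include "scale.h"; compiled by nvcc ----

// The one and only definition of the kernel in the whole program.
__global__ void scale_kernel(float* data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Host wrapper that hides the launch syntax from the rest of the program.
// Assumption: CUDA runtime errors are ignored here to keep the sketch short.
void scale_on_device(std::vector<float>& values, float factor)
{
    int n = static_cast<int>(values.size());
    if (n == 0) return;

    float* d_data = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    cudaMalloc(&d_data, bytes);
    cudaMemcpy(d_data, values.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // round up to cover all n elements
    scale_kernel<<<blocks, threads>>>(d_data, factor, n);

    // cudaMemcpy waits for the kernel to finish before copying back.
    cudaMemcpy(values.data(), d_data, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_data);
}
```

Any other source module can then include scale.h and call scale_on_device(values, 2.0f). Since that header contains no __global__ and no launch syntax, it can also be included from ordinary .cpp files that the host compiler builds without nvcc.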
