Reputation: 1618
I am getting linker errors while building a project with CMake. The build works fine on Linux with a hand-written makefile (no CMake), but the Windows build is giving me problems. Here is a bare-bones example to demonstrate my approach:
I have 3 files in the same directory (kernel.cu, kernel.h, main.c).
main.c:
extern void kernel_wrapper();
int main(){
    kernel_wrapper();
}
kernel.h:
#ifndef KERNELH
#define KERNELH
extern "C" void kernel_wrapper();
#endif
kernel.cu:
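#include <stdio.h>
#include "kernel.h"
__global__ void kernel (){
    printf("hello from GPU!");
}
void kernel_wrapper(){
    kernel<<<1,1>>>();
    // wait for the kernel so its printf buffer is flushed before the program exits
    cudaDeviceSynchronize();
}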
I want to use separable compilation so the host code is isolated from device code using externs. Here is the CMakeLists.txt file I am using:
cmake_minimum_required(VERSION 3.9 FATAL_ERROR)
project(cudatest LANGUAGES C CUDA)
add_executable(c-exec main.c)
add_library(cu-lib STATIC kernel.cu kernel.h)
set_target_properties(cu-lib PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
target_link_libraries(c-exec PUBLIC cu-lib)
The library and the executable compile successfully, but at the linking step the following error occurs:
cu-lib.lib(kernel.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_41_tmpxft_00003dd0_00000000_7_kernel_cpp1_ii_b81a68a1 referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z) [C:\cudatest\build\c-exec.vcxproj]
C:\cudatest\build\Debug\c-exec.exe : fatal error LNK1120: 1 unresolved externals [C:\cudatest\build\c-exec.vcxproj]
It sounds like the linker is missing some libraries (probably cudart?). However, the link command on the console looks normal to me, with the static libraries and everything:
Link:
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64\link.exe /ERRORREPORT:QUEUE /OUT:"C:\cudatest\build\Debug\c-exec.exe" /INCREMENTAL /NOLOGO /LIBPATH:"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\lib\x64" "Debug\cu-lib.lib" cudart_static.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /manifest:embed /DEBUG /PDB:"C:/cudatest/build/Debug/c-exec.pdb" /SUBSYSTEM:CONSOLE /TLBID:1 /DYNAMICBASE /NXCOMPAT /IMPLIB:"C:/cudatest/build/Debug/c-exec.lib" /MACHINE:X64 /machine:x64 "c-exec.dir\Debug\main.obj"
I am clueless at the moment. If I give up on separate compilation (change the add_executable arguments to "c-exec main.c kernel.h kernel.cu" and skip the library step), the problem disappears, but I am required to use separate compilation.
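For comparison, the single-target variant that builds cleanly would look roughly like this. It is only a sketch: it assumes the set_target_properties call is dropped together with the library, since the cu-lib target no longer exists.

```cmake
cmake_minimum_required(VERSION 3.9 FATAL_ERROR)
project(cudatest LANGUAGES C CUDA)

# Host and device sources go into one executable target;
# there is no static library and no target_link_libraries call.
add_executable(c-exec main.c kernel.h kernel.cu)
```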
This question looks relevant to mine, and I tried to apply the solution suggested there. Placing externs around the include line and before the function definition changed nothing.
I am using Windows 10 with CMake v3.11.4 and CUDA v9.2, and I select the CMake generator with "cmake -G "Visual Studio 15 2017 Win64" -T v140", since the latest version of Visual Studio is not supported by CUDA yet.
Edit: Added full console output
Build started 12.07.2018 03:03:42.
Project "C:\cudatest\build\ALL_BUILD.vcxproj" on node 1 (default targets).
Project "C:\cudatest\build\ALL_BUILD.vcxproj" (1) is building "C:\cudatest\build\ZERO_
CHECK.vcxproj" (2) on node 1 (default targets).
PrepareForBuild:
Creating directory "x64\Debug\ZERO_CHECK\".
Creating directory "x64\Debug\ZERO_CHECK\ZERO_CHECK.tlog\".
InitializeBuildStatus:
Creating "x64\Debug\ZERO_CHECK\ZERO_CHECK.tlog\unsuccessfulbuild" because "AlwaysCre
ate" was specified.
CustomBuild:
Checking Build System
CMake does not need to re-run because C:/cudatest/build/CMakeFiles/generate.stamp is
up-to-date.
FinalizeBuildStatus:
Deleting file "x64\Debug\ZERO_CHECK\ZERO_CHECK.tlog\unsuccessfulbuild".
Touching "x64\Debug\ZERO_CHECK\ZERO_CHECK.tlog\ZERO_CHECK.lastbuildstate".
Done Building Project "C:\cudatest\build\ZERO_CHECK.vcxproj" (default targets).
Project "C:\cudatest\build\ALL_BUILD.vcxproj" (1) is building "C:\cudatest\build\c-exe
c.vcxproj" (3) on node 1 (default targets).
Project "C:\cudatest\build\c-exec.vcxproj" (3) is building "C:\cudatest\build\cu-lib.v
cxproj" (4) on node 1 (default targets).
PrepareForBuild:
Creating directory "cu-lib.dir\Debug\".
Creating directory "C:\cudatest\build\Debug\".
Creating directory "cu-lib.dir\Debug\cu-lib.tlog\".
InitializeBuildStatus:
Creating "cu-lib.dir\Debug\cu-lib.tlog\unsuccessfulbuild" because "AlwaysCreate" was
specified.
CustomBuild:
Building Custom Rule C:/cudatest/CMakeLists.txt
CMake does not need to re-run because C:/cudatest/build/CMakeFiles/generate.stamp is
up-to-date.
AddCudaCompileDeps:
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64\cl.exe /E /nolo
go /showIncludes /TP /D__CUDACC__ /D_WINDOWS /DCMAKE_INTDIR="Debug" /DCMAKE_INTDIR="
Debug" /D_MBCS /I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\include" /
I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin" /I"C:\Program Files\N
VIDIA GPU Computing Toolkit\CUDA\v9.2\include" /I. /FIcuda_runtime.h /c C:\cudatest\
kernel.cu
Project "C:\cudatest\build\cu-lib.vcxproj" (4) is building "C:\cudatest\build\cu-lib.v
cxproj" (4:2) on node 1 (CudaBuildCore target(s)).
CudaBuildCore:
Compiling CUDA source file ..\kernel.cu...
cmd.exe /C "C:\Users\dulls\AppData\Local\Temp\tmpd0a36f1624184bf982dc8082e9a727be.cm
d"
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin\nvcc.exe" -gencode=arch
=compute_30,code=\"sm_30,compute_30\" --use-local-env -ccbin "C:\Program Files (x86)
\Microsoft Visual Studio 14.0\VC\bin\x86_amd64" -x cu -rdc=true -I"C:\Program Files\
NVIDIA GPU Computing Toolkit\CUDA\v9.2\include" -I"C:\Program Files\NVIDIA GPU Compu
ting Toolkit\CUDA\v9.2\include" --keep-dir x64\Debug -maxrregcount=0 --machine
64 --compile -cudart static -Xcompiler="/EHsc -Zi -Ob0" -g -D"_WINDOWS" -D"CMAKE_I
NTDIR=\"Debug\"" -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O
d /FS /Zi /RTC1 /MDd /GR" -o cu-lib.dir\Debug\kernel.obj "C:\cudatest\kernel.cu"
C:\cudatest\build>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin\nvcc.
exe" -gencode=arch=compute_30,code=\"sm_30,compute_30\" --use-local-env -ccbin "C:\P
rogram Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64" -x cu -rdc=true -I
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\include" -I"C:\Program File
s\NVIDIA GPU Computing Toolkit\CUDA\v9.2\include" --keep-dir x64\Debug -maxrregc
ount=0 --machine 64 --compile -cudart static -Xcompiler="/EHsc -Zi -Ob0" -g -D"_W
INDOWS" -D"CMAKE_INTDIR=\"Debug\"" -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -Xcompiler "/E
Hsc /W3 /nologo /Od /FS /Zi /RTC1 /MDd /GR" -o cu-lib.dir\Debug\kernel.obj "C:\cudat
est\kernel.cu"
kernel.cu
Done Building Project "C:\cudatest\build\cu-lib.vcxproj" (CudaBuildCore target(s)).
Lib:
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64\Lib.exe /OUT:"C
:\cudatest\build\Debug\cu-lib.lib" /NOLOGO /MACHINE:X64 /machine:x64 "cu-lib.dir\De
bug\kernel.obj"
cu-lib.vcxproj -> C:\cudatest\build\Debug\cu-lib.lib
FinalizeBuildStatus:
Deleting file "cu-lib.dir\Debug\cu-lib.tlog\unsuccessfulbuild".
Touching "cu-lib.dir\Debug\cu-lib.tlog\cu-lib.lastbuildstate".
Done Building Project "C:\cudatest\build\cu-lib.vcxproj" (default targets).
PrepareForBuild:
Creating directory "c-exec.dir\Debug\".
Creating directory "c-exec.dir\Debug\c-exec.tlog\".
InitializeBuildStatus:
Creating "c-exec.dir\Debug\c-exec.tlog\unsuccessfulbuild" because "AlwaysCreate" was
specified.
CustomBuild:
Building Custom Rule C:/cudatest/CMakeLists.txt
CMake does not need to re-run because C:/cudatest/build/CMakeFiles/generate.stamp is
up-to-date.
ClCompile:
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64\CL.exe /c /I"C:
\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\include" /Zi /nologo /W3 /WX-
/Od /Ob0 /D "WIN32" /D "_WINDOWS" /D "CMAKE_INTDIR=\"Debug\"" /D _MBCS /Gm- /RTC1 /M
Dd /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /Fo"c-exec.dir\Debug\\" /Fd"c
-exec.dir\Debug\vc140.pdb" /Gd /TC /errorReport:queue C:\cudatest\main.c
main.c
Link:
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64\link.exe /ERROR
REPORT:QUEUE /OUT:"C:\cudatest\build\Debug\c-exec.exe" /INCREMENTAL /NOLOGO /LIBPATH
:"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\lib\x64" "Debug\cu-lib.lib
" cudart_static.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32
.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTUAC:"level='
asInvoker' uiAccess='false'" /manifest:embed /DEBUG /PDB:"C:/cudatest/build/Debug/c-
exec.pdb" /SUBSYSTEM:CONSOLE /TLBID:1 /DYNAMICBASE /NXCOMPAT /IMPLIB:"C:/cudatest/bu
ild/Debug/c-exec.lib" /MACHINE:X64 /machine:x64 "c-exec.dir\Debug\main.obj"
Creating library C:/cudatest/build/Debug/c-exec.lib and object C:/cudatest/build/
Debug/c-exec.exp
cu-lib.lib(kernel.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinke
dBinary_41_tmpxft_00013688_00000000_7_kernel_cpp1_ii_b81a68a1 referenced in function "
void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallb
ack@@YAXPEAPEAX@Z) [C:\cudatest\build\c-exec.vcxproj]
C:\cudatest\build\Debug\c-exec.exe : fatal error LNK1120: 1 unresolved externals [C:\c
udatest\build\c-exec.vcxproj]
Done Building Project "C:\cudatest\build\c-exec.vcxproj" (default targets) -- FAILED.
Done Building Project "C:\cudatest\build\ALL_BUILD.vcxproj" (default targets) -- FAILE
D.
Build FAILED.
"C:\cudatest\build\ALL_BUILD.vcxproj" (default target) (1) ->
"C:\cudatest\build\c-exec.vcxproj" (default target) (3) ->
(Link target) ->
cu-lib.lib(kernel.obj) : error LNK2019: unresolved external symbol __cudaRegisterLin
kedBinary_41_tmpxft_00013688_00000000_7_kernel_cpp1_ii_b81a68a1 referenced in function
"void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCal
lback@@YAXPEAPEAX@Z) [C:\cudatest\build\c-exec.vcxproj]
C:\cudatest\build\Debug\c-exec.exe : fatal error LNK1120: 1 unresolved externals [C:
\cudatest\build\c-exec.vcxproj]
0 Warning(s)
2 Error(s)
Upvotes: 2
Views: 2214
Reputation: 1618
I think I found a solution to the issue (thanks for the guidance, @talonmies).
I had to force device linking using:
set_property(TARGET cu-lib PROPERTY CUDA_RESOLVE_DEVICE_SYMBOLS ON)
and it works!
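Put together with the CMakeLists.txt from the question, the whole file would look something like the sketch below. It uses the target name cu-lib from the question; folding both properties into one set_target_properties call is my own arrangement, and a separate set_property line should behave the same.

```cmake
cmake_minimum_required(VERSION 3.9 FATAL_ERROR)
project(cudatest LANGUAGES C CUDA)

add_executable(c-exec main.c)
add_library(cu-lib STATIC kernel.cu kernel.h)

# CUDA_SEPARABLE_COMPILATION compiles kernel.cu as relocatable device code.
# CUDA_RESOLVE_DEVICE_SYMBOLS runs the device link when cu-lib itself is built,
# instead of deferring it to the link step of c-exec.
set_target_properties(cu-lib PROPERTIES
    CUDA_SEPARABLE_COMPILATION ON
    CUDA_RESOLVE_DEVICE_SYMBOLS ON)

target_link_libraries(c-exec PUBLIC cu-lib)
```

After changing the file, CMake has to regenerate the Visual Studio projects before the new property takes effect.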
From the CUDA_RESOLVE_DEVICE_SYMBOLS man page:
CUDA only: Enables device linking for the specific static library target
If set this will enable device linking on this static library target. Normally device linking is deferred until a shared library or executable is generated, allowing for multiple static libraries to resolve device symbols at the same time.
I am still not sure why generating my executable doesn't trigger a device link before everything is host-linked together.
Edit: It looks like this is a known bug that affects Windows builds; the same setup works fine on Linux.
Upvotes: 3