Reputation: 463
I am using Visual Studio 2012. When I launch my program in Debug mode, it runs smoothly. However, when I run it in Release mode, I get the error "too many resources requested for launch" on one of my kernels.
I compared the compilation parameters between Debug and Release, and it seems that when I change the option "Generate GPU Debug Information" to "Yes (-G)", the problem disappears.
Why is it so? Is there something I am missing that would let me run the program without that option?
Compile options:
D:\Dev\CUDA\bin\nvcc.exe -gencode=arch=compute_50,code=\"sm_50,compute_50\" --use-local-env --cl-version 2012 -ccbin "D:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64" -ID:\Dev\CUDA\include -ID:\Dev\CUDA\include -G -lineinfo --keep-dir x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -DWIN32 -DWIN64 -DNDEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Zi /MD " -o x64\Release\main.cu.obj "D:\Dev\Projets\CUDA\ProjAdvMetrixCuda6\Discrete choice v2\main.cu"
Upvotes: 0
Views: 2620
Reputation: 151972
Why is it so?
It's likely a registers-per-thread issue. Code generation differs significantly between release and debug builds, and this affects the number of registers used per GPU thread. Each block needs roughly registers per thread times threads per block; if that exceeds what the GPU can provide, the kernel will not launch.
You can quickly confirm this by modifying this particular command line switch:
-maxrregcount=0
to some other value. You can set this in one of the Visual Studio project configuration fields. I would start with a value of something like 20. If that lets the release build run, then you have a registers-per-thread issue. You can get more information by studying some of the answers that already discuss it, such as the answer here.
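If you prefer to check this from code rather than by trial and error, a minimal sketch along these lines may help. It asks the runtime how many registers the compiled kernel uses and compares that with the device's per-block limit. Note that `myKernel`, the block size of 1024, and the helper `fitsRegisterBudget` are hypothetical placeholders, not taken from the question, and the helper ignores the hardware's register allocation granularity, so treat its answer as an estimate only. Compiling with `-Xptxas -v` should also print the per-kernel register count at build time.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the kernel that fails to launch.
__global__ void myKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = 2.0f * data[i];
}

// Rough estimate: does one block's register demand fit the per-block limit?
// Ignores allocation granularity, so the real limit can be slightly stricter.
bool fitsRegisterBudget(int regsPerThread, int threadsPerBlock, int regsPerBlock)
{
    return (long long)regsPerThread * threadsPerBlock <= regsPerBlock;
}

int main()
{
    const int threadsPerBlock = 1024;  // assumed block size, adjust to your launch

    // Registers per thread as actually compiled for this build configuration.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, myKernel);
    if (err != cudaSuccess) {
        printf("cudaFuncGetAttributes failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("registers per thread : %d\n", attr.numRegs);
    printf("registers per block  : %d (device limit)\n", prop.regsPerBlock);
    printf("estimated fit        : %s\n",
           fitsRegisterBudget(attr.numRegs, threadsPerBlock, prop.regsPerBlock)
               ? "yes" : "no");

    // Launch and check the error immediately; a failed launch is reported here.
    float *d = nullptr;
    cudaMalloc(&d, threadsPerBlock * sizeof(float));
    myKernel<<<1, threadsPerBlock>>>(d, threadsPerBlock);
    err = cudaGetLastError();
    if (err != cudaSuccess)
        printf("launch failed: %s\n", cudaGetErrorString(err));
    cudaDeviceSynchronize();
    cudaFree(d);
    return 0;
}
```

Building this once in Debug and once in Release and comparing the reported register counts should show whether the two configurations really differ in the way described above.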
Note that this problem is not specific to release vs. debug. Anything that affects code generation could lead to a similar problem, such as building for 32-bit vs. 64-bit, or other project or compiler setting differences. The solution path is the same.
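Since `-maxrregcount` applies to every kernel in the file, one alternative worth knowing about is the `__launch_bounds__` qualifier, which gives the compiler the same kind of hint for a single kernel. The sketch below is only an illustration: `scaleKernel`, the bound of 256 threads, and the helper `blocksFor` are assumptions, not part of the original project. The compiler uses the declared maximum block size to keep register use low enough for such a block to launch, possibly at the cost of spilling to local memory.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumed maximum block size this kernel will ever be launched with.
constexpr int kMaxThreadsPerBlock = 256;

// __launch_bounds__ tells the compiler the largest block size to plan for,
// so it can limit registers per thread for this kernel only.
__global__ void __launch_bounds__(kMaxThreadsPerBlock)
scaleKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = 2.0f * data[i];
}

// Number of blocks needed to cover n elements (ceiling division).
int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

int main()
{
    const int n = 1000;
    float *d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    // Launching with more threads per block than the declared bound would fail.
    scaleKernel<<<blocksFor(n, kMaxThreadsPerBlock), kMaxThreadsPerBlock>>>(d, n);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess)
        printf("launch failed: %s\n", cudaGetErrorString(err));

    cudaDeviceSynchronize();
    cudaFree(d);
    return 0;
}
```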
Upvotes: 2