Reputation: 131435
In the CUDA 11 features announcement, it's said that there are now:
New link time optimization capabilities
What link-time optimizations does nvcc actually employ (e.g. relative to the LTO capabilities in host-side code with g++ or clang++)?
Also, is there something one needs to do to get LTO enabled, or does it always occur (unlike with host-side code, where you need to compile with an -flto switch)?
Upvotes: 4
Views: 992
Reputation: 1
My guess is that -dlto has to be specified at both compile time and link time. If you link your program with something other than nvcc, such as gcc or g++, you may not get the best performance.
Upvotes: 0
Reputation: 131435
Partial answer:
To enable link-time optimization, use --dlink-time-opt (or -dlto) when invoking the NVCC compiler, both for compilation and for device-side code linking. No (link-time) optimization will be applied if the compiler can't find the relevant intermediate information.
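As a rough illustration of passing the flag at both stages, here is a minimal sketch. The file names (scale.cu, main.cu, app), the sm_80 target, and the use of separate compilation with -dc are my own assumptions for the example, not something taken from the announcement. It only builds if nvcc is on the PATH, and a toolkit too old to support -dlto will simply report an error.

```shell
#!/bin/sh
# Sketch only: file names and the sm_80 target are illustrative assumptions.
ARCH="sm_80"
LTO_FLAGS="-dlto -arch=$ARCH"

if command -v nvcc >/dev/null 2>&1; then
  workdir=$(mktemp -d)

  # Translation unit 1: a small device function defined on its own.
  cat > "$workdir/scale.cu" <<'EOF'
__device__ float scale(float x) { return 2.0f * x; }
EOF

  # Translation unit 2: a kernel that calls the function from unit 1.
  cat > "$workdir/main.cu" <<'EOF'
extern __device__ float scale(float x);
__global__ void kernel(float *data) { data[threadIdx.x] = scale(data[threadIdx.x]); }
int main() { return 0; }
EOF

  # Compile step: -dc produces relocatable device code, and -dlto stores the
  # intermediate representation in each object file.
  nvcc $LTO_FLAGS -dc "$workdir/scale.cu" -o "$workdir/scale.o"
  nvcc $LTO_FLAGS -dc "$workdir/main.cu" -o "$workdir/main.o"

  # Link step: -dlto is passed again so the device linker can optimize
  # across both objects using the stored intermediate code.
  nvcc $LTO_FLAGS "$workdir/scale.o" "$workdir/main.o" -o "$workdir/app"
else
  echo "nvcc not found; skipping build"
fi
```

If -dlto is left off either step, the build should still succeed, but per the note above no link-time optimization would be applied.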
Upvotes: 2