Reputation: 18178
I'm receiving the compiler error
static_assert failed: 'Attempt to use an extended __device__ lambda in a context that requires querying its return type in host code. Use a named function object, a __host__ __device__ lambda, or cuda::proclaim_return_type instead.'
when compiling this code:
thrust::device_vector<float2> a;
thrust::device_vector<float> b;
float param1, param2;
float2 param3;
thrust::transform_reduce(
    thrust::make_zip_iterator(thrust::make_tuple(a.begin(), b.begin())),
    thrust::make_zip_iterator(thrust::make_tuple(a.end(), b.end())),
    [param1, param2, param3] __device__ (thrust::tuple<float2, float> const& tuple)
    {
        /* do something and return a float2 */
    },
    float2{},
    [] __device__ (float2 const& first, float2 const& second)
    {
        float2 result{};
        result.x = first.x + second.x;
        result.y = first.y + second.y;
        return result;
    });
How can I rewrite this code so that it compiles? I'm new to CUDA/Thrust and it's quite obscure to me how to deal with this problem.
(BTW, do I really need the complicated "transformation" function here? thrust::plus<float2> doesn't work.)
Upvotes: 2
Views: 482
Reputation: 16242
To complement Robert Crovella's answer, here is what worked and didn't work for me with CUDA 12.0 and CUDA 12.5 for a typical Thrust function over a GPU container:
auto res2 = thrust::transform_reduce(
    x.begin(), x.end(),
    // vvv--- works (?) in CUDA 12.0, doesn't work with CUDA 12.5
    // [] __device__ (T const& e) {return std::abs(e.real()) + std::abs(e.imag());},
    // vvv--- works (?) in CUDA 12.0, doesn't work with CUDA 12.5
    // [] __device__ (T const& e) -> double {return std::abs(e.real()) + std::abs(e.imag());},
    // vvv--- doesn't work in CUDA 12.0, works with CUDA 12.5
    // cuda::proclaim_return_type<double>([] __device__ (T const& e) {return std::abs(e.real()) + std::abs(e.imag());}),
    // vvv--- doesn't work
    // [] (T const& e) constexpr {return std::abs(e.real()) + std::abs(e.imag());},
    // vvv--- works, but needs the function to be __host__ as well
    [] __host__ __device__ (T const& e) {return std::abs(e.real()) + std::abs(e.imag());},
    double{}, thrust::plus<>{}
);
(by "doesn't work" I mean "doesn't compile", by "works" I mean "compiles and gives the right result, apparently")
Upvotes: 0
Reputation: 151869
How can I rewrite this code so that it compiles?
A possible solution seems to be indicated in the assert message you posted in your question:
static_assert failed: 'Attempt to use an extended __device__ lambda in a context that requires querying its return type in host code. Use a named function object, a __host__ __device__ lambda, or cuda::proclaim_return_type instead.'
When I change each lambda to be decorated with __host__ __device__ instead of just __device__, the code compiles for me on CUDA 12.1.
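Alternatively, the "named function object" option from the assert message would look roughly like this for the combining lambda (a sketch; add_float2 is a hypothetical name):
#include <cuda_runtime.h>  // float2 (implicit under nvcc, included here for clarity)

// hypothetical named functor replacing the combining lambda from the question;
// a named type lets thrust query the return type in host code
struct add_float2
{
    __host__ __device__
    float2 operator()(float2 const& first, float2 const& second) const
    {
        float2 result{};
        result.x = first.x + second.x;
        result.y = first.y + second.y;
        return result;
    }
};
Passing add_float2{} as the last argument to thrust::transform_reduce then takes the place of that lambda, and the same treatment works for the transformation step.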
(BTW, do I really need the complicated "transformation" function here? thrust::plus<float2> doesn't work.)
CUDA doesn't provide arithmetic operators for its built-in vector types such as float2, and AFAIK Thrust doesn't either; the requirement for thrust::plus<T> is that x+y must be defined for objects x and y of type T.
So you'll need to provide your own definition to add two float2 values, i.e.
__host__ __device__ __forceinline__
float2 operator+(float2 left, float2 right) noexcept {
return float2{left.x + right.x,
left.y + right.y};
}
which will allow you to use thrust::plus<float2>.
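For completeness, here is a sketch of the question's call with both changes applied. The body of the transformation lambda is a hypothetical placeholder for the elided "do something", the operator+ above is repeated so the snippet stands alone, and nvcc still needs --extended-lambda (which the original __device__ lambdas required as well):
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/transform_reduce.h>
#include <thrust/tuple.h>

// the operator+ from above, so thrust::plus<float2> has something to call
__host__ __device__ __forceinline__
float2 operator+(float2 left, float2 right) noexcept {
    return float2{left.x + right.x,
                  left.y + right.y};
}

int main()
{
    thrust::device_vector<float2> a(4, float2{1.0f, 2.0f});
    thrust::device_vector<float>  b(4, 0.5f);
    float  param1 = 1.0f, param2 = 2.0f;
    float2 param3 = float2{3.0f, 4.0f};

    float2 sum = thrust::transform_reduce(
        thrust::make_zip_iterator(thrust::make_tuple(a.begin(), b.begin())),
        thrust::make_zip_iterator(thrust::make_tuple(a.end(), b.end())),
        // __host__ __device__ so thrust can query the return type in host code
        [param1, param2, param3] __host__ __device__ (thrust::tuple<float2, float> const& t)
        {
            float2 v = thrust::get<0>(t);
            float  s = thrust::get<1>(t);
            // placeholder for the question's "do something and return a float2"
            return float2{v.x * s + param1 * param3.x,
                          v.y * s + param2 * param3.y};
        },
        float2{},
        thrust::plus<float2>()); // the hand-written combining lambda is no longer needed

    (void)sum;
    return 0;
}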
Upvotes: 2