mrx_hk

Reputation: 31

Cuda thrust without using device_vectors?

I've written kernels using normal CUDA code which does not use thrust device vectors. The kernels outputs some results stored in an array on the device, say array X. I'd like to now do a reduction on X. Is there a way to use the thrust::reduction function without first copying X into a thrust::device_vector variable?

Upvotes: 2

Views: 403

Answers (1)

talonmies

Reputation: 72349

The traditional way to do this is to wrap the raw device pointer in a thrust::device_ptr and pass that to the Thrust algorithm. Thrust's tag-based template model ensures that execution happens on the device, because of the type of the input sequence supplied in the call.

#include <thrust/device_ptr.h>
#include <thrust/reduce.h>

int* X;
cudaMalloc((void **)&X, sizeof(int) * size_t(N));

// Do stuff with X

thrust::device_ptr<int> X_ptr(X);
int result = thrust::reduce(X_ptr, X_ptr+N);

From Thrust 1.7 onwards, the concept of execution policies has been introduced. This removes the need to explicitly wrap the device address in a device_ptr. Instead, you can use the thrust::device policy to indicate that the input iterators lie in device memory, and do something like

#include <thrust/reduce.h>
#include <thrust/execution_policy.h>

int* X;
cudaMalloc((void **)&X, sizeof(int) * size_t(N));

// Do stuff with X
int result = thrust::reduce(thrust::device, X, X+N);

Which way you choose should be guided by which version of Thrust you have and which code style you prefer.
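To tie it back to your situation, here is a minimal complete sketch of the first approach: a trivial placeholder kernel (fill_ones, which just stands in for whatever your real kernel computes into X) followed by a reduction over the same device allocation. The kernel name, N, and launch configuration are illustrative choices, not anything mandated by Thrust:

```
#include <thrust/device_ptr.h>
#include <thrust/reduce.h>
#include <cstdio>

// Placeholder for your own kernel: writes 1 into each element of X
__global__ void fill_ones(int* X, int N)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N) X[i] = 1;
}

int main()
{
    const int N = 1024;
    int* X;
    cudaMalloc((void **)&X, sizeof(int) * size_t(N));

    fill_ones<<<(N + 255) / 256, 256>>>(X, N);

    // Wrap the raw device pointer and reduce in place; no copy into
    // a thrust::device_vector is needed
    thrust::device_ptr<int> X_ptr(X);
    int result = thrust::reduce(X_ptr, X_ptr + N);
    printf("sum = %d\n", result);   // expect N with this placeholder kernel

    cudaFree(X);
    return 0;
}
```

With the execution-policy style you would drop the device_ptr wrapper and call thrust::reduce(thrust::device, X, X + N) instead; everything else stays the same.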

Upvotes: 4
