Force CUDA's thrust::reduce to execute with no parallelism

Question

I have a CUDA program that uses thrust::reduce to compute sums in parallel. For example:

thrust::device_ptr<double> tmp(aux);
double my_sum = thrust::reduce(tmp, tmp + G);

where double* aux points to G contiguous doubles on the device. I need to compare the runtime of the whole parallelized program to a version with no parallel computation. Is there a way to run thrust::reduce using only a single thread on the device? A global switch would be the most convenient option.
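One approach that may fit, sketched here under assumptions rather than as a definitive answer: launch a kernel with a single block of a single thread and call thrust::reduce inside it with the thrust::seq execution policy, which asks Thrust to run the algorithm sequentially in the calling thread. The names reduce_single_thread and single_thread_sum are hypothetical helpers made up for this illustration, and error checking is left out for brevity. This is a per-call change rather than the global switch asked about.

```cuda
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <thrust/reduce.h>
#include <cuda_runtime.h>

// Hypothetical kernel, intended to be launched as <<<1, 1>>>:
// the one and only thread sums n doubles starting at `in`.
__global__ void reduce_single_thread(const double* in, int n, double* out)
{
    // thrust::seq makes the reduction run sequentially in this thread.
    *out = thrust::reduce(thrust::seq, in, in + n, 0.0);
}

// Hypothetical host-side helper: d_in must point to n doubles on the device.
double single_thread_sum(const double* d_in, int n)
{
    thrust::device_vector<double> d_out(1);  // one-element result buffer on the device
    reduce_single_thread<<<1, 1>>>(d_in, n, thrust::raw_pointer_cast(d_out.data()));
    cudaDeviceSynchronize();                 // wait for the kernel to finish
    return d_out[0];                         // copies the result back to the host
}
```

With the variables from the question, the call would be double my_sum = single_thread_sum(aux, G);. Timing this against the default thrust::reduce(tmp, tmp + G) compares one GPU thread with the parallel device reduction. Note that a single GPU thread is much slower than a single CPU core, so this is not the same as a serial CPU baseline.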
