Pan.da
Pan.da

Reputation: 41

Use thrust to find element in groups

I have two int vectors for keys and values, their size is about 500K.

The key vector is already sorted. And there are 10K groups approximately.

The value is non-negative(stands for useful) or -2(stands for no use), in each group there should be one or zero non-negative values, and the rest is -2.

key:   0  0  0  0 1  2  2  3  3  3  3 
value:-2 -2  1 -2 3 -2 -2 -2 -2 -2  0

The third pair of group 0 [0 1] is useful. For group 1 we get the pair[1 3]. The values of group 2 are all -2, so we get nothing. And for group 3, the result is [3 0].

So, the question is how can I do this by thrust or cuda ?

Here are two ideas.

First one: Get the number of each group by a histogram algorithm. So the barrier of each group can be computed. Operate thrust::find_if on each group to get the useful element.

Second one: Use thrust::transform to add 2 for every value and now all the value are non-negative, and zero stands for useless. Use thrust::reduce_by_key to get the reduction for every group, and then subtract 2 for every output value.

I think there must be some other methods which will achieve much more performance than the above two.

Performance of the methods:

I have test the Second method above and the method given by @Robert Crovella, ie. reduce_by_key and remove_if method. The size of the vectors is 2691028, the vectors consist of 100001 groups. Here is their average time:

reduce_by_key: 1204ms
remove_if: 192ms

From above result, we can see that remove_if method is much faster. And also the "remove_if" method is easy to implement and consume much less gpu memory.

Briefly, @Robert Crovella 's method is very good.

Upvotes: 0

Views: 670

Answers (1)

Robert Crovella
Robert Crovella

Reputation: 151972

I would use a thrust::zip_iterator to zip the key and values pairs together, and then I would do a thrust::remove_if operation on the zipped values which would require a functor definition that would indicate to remove every pair for which the value is negative (or whatever test you wish.)

Here's a worked example:

$ cat t1009.cu
#include <thrust/remove.h>
#include <thrust/device_vector.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/copy.h>

struct remove_func
{
  template <typename T>
  __host__ __device__
  bool operator()(T &t){
    return (thrust::get<1>(t) < 0); // could change to other kinds of tests
    }
};

int main(){

  int keys[] = {0,0,0,0,1,2,2,3,3,3,3};
  int vals[] = {-2,-2,1,-2,3,-2,-2,-2,-2,-2,0};
  size_t dsize = sizeof(keys)/sizeof(int);

  thrust::device_vector<int>dkeys(keys, keys+dsize);
  thrust::device_vector<int>dvals(vals, vals+dsize);
  auto zr = thrust::make_zip_iterator(thrust::make_tuple(dkeys.begin(), dvals.begin()));

  size_t rsize = thrust::remove_if(zr, zr+dsize, remove_func()) - zr;

  thrust::copy_n(dkeys.begin(), rsize, std::ostream_iterator<int>(std::cout, ","));
  std::cout << std::endl;
  thrust::copy_n(dvals.begin(), rsize, std::ostream_iterator<int>(std::cout, ","));
  std::cout << std::endl;
  return 0;
}
$ nvcc -std=c++11 -o t1009 t1009.cu
$ ./t1009
0,1,3,
1,3,0,
$

Upvotes: 4

Related Questions