Reputation: 41
I have two int vectors for keys and values, their size is about 500K.
The key vector is already sorted. And there are 10K groups approximately.
The value is non-negative(stands for useful) or -2(stands for no use), in each group there should be one or zero non-negative values, and the rest is -2.
key: 0 0 0 0 1 2 2 3 3 3 3
value:-2 -2 1 -2 3 -2 -2 -2 -2 -2 0
The third pair of group 0 [0 1]
is useful. For group 1 we get the pair[1 3]
. The values of group 2 are all -2, so we get nothing. And for group 3, the result is [3 0]
.
So, the question is how can I do this by thrust or cuda ?
Here are two ideas.
First one:
Get the number of each group by a histogram algorithm. So the barrier of each group can be computed.
Operate thrust::find_if
on each group to get the useful element.
Second one:
Use thrust::transform
to add 2 for every value and now all the value are non-negative, and zero stands for useless.
Use thrust::reduce_by_key
to get the reduction for every group, and then subtract 2 for every output value.
I think there must be some other methods which will achieve much more performance than the above two.
Performance of the methods:
I have test the Second method above and the method given by @Robert Crovella, ie. reduce_by_key and remove_if method. The size of the vectors is 2691028, the vectors consist of 100001 groups. Here is their average time:
reduce_by_key: 1204ms
remove_if: 192ms
From above result, we can see that remove_if method is much faster. And also the "remove_if" method is easy to implement and consume much less gpu memory.
Briefly, @Robert Crovella 's method is very good.
Upvotes: 0
Views: 670
Reputation: 151972
I would use a thrust::zip_iterator
to zip the key and values pairs together, and then I would do a thrust::remove_if
operation on the zipped values which would require a functor definition that would indicate to remove every pair for which the value is negative (or whatever test you wish.)
Here's a worked example:
$ cat t1009.cu
#include <thrust/remove.h>
#include <thrust/device_vector.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/copy.h>
struct remove_func
{
template <typename T>
__host__ __device__
bool operator()(T &t){
return (thrust::get<1>(t) < 0); // could change to other kinds of tests
}
};
int main(){
int keys[] = {0,0,0,0,1,2,2,3,3,3,3};
int vals[] = {-2,-2,1,-2,3,-2,-2,-2,-2,-2,0};
size_t dsize = sizeof(keys)/sizeof(int);
thrust::device_vector<int>dkeys(keys, keys+dsize);
thrust::device_vector<int>dvals(vals, vals+dsize);
auto zr = thrust::make_zip_iterator(thrust::make_tuple(dkeys.begin(), dvals.begin()));
size_t rsize = thrust::remove_if(zr, zr+dsize, remove_func()) - zr;
thrust::copy_n(dkeys.begin(), rsize, std::ostream_iterator<int>(std::cout, ","));
std::cout << std::endl;
thrust::copy_n(dvals.begin(), rsize, std::ostream_iterator<int>(std::cout, ","));
std::cout << std::endl;
return 0;
}
$ nvcc -std=c++11 -o t1009 t1009.cu
$ ./t1009
0,1,3,
1,3,0,
$
Upvotes: 4