Reputation: 374
I have an algorithm I would like to implement, which involves
My addition and multiplication are a little annoying (they are arithmetic modulo n), but I imagine I can port over standard algorithms, e.g. using Montgomery arithmetic, so am not focusing on that topic now.
For this last step, I mean that mapping a vector [a, b, c, d] -> [b, c, d, a] -> [c, d, a, b] -> [d, a, b, c] -> [a,b,c,d]
, though in general my vectors will be ~1000 dimensions, so the general cyclic rotation will be somewhat more complex.
What is the best way to accomplish this third step that I require? Looking into the Thrust library, it seems that the simplest thing to do is to use a permutation_iterator
. This should suffice to provide a correct implementation, but given that the permutation I want to compute is exceedingly simple, I was wondering if this is the fastest option.
Also, while this question is tagged thrust
, if there is some obviously better library/framework to use, I would appreciate pointers.
Upvotes: 1
Views: 134
Reputation: 13295
If you use a transform iterator to build the permutation on-the-fly, it should be very efficient; certainly faster than actually moving the data.
This should work:
#include <thrust/device_vector.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/permutation_iterator.h>
#include <thrust/iterator/transform_iterator.h>
struct rotate_right : public thrust::unary_function<int,int>
{
int rotate_by, end;
rotate_right(int rotate_by, int end) noexcept
: rotate_by(rotate_by), end(end)
{ assert(rotate_by >= 0 && rotate_by <= end); }
__host__ __device__
int operator()(int idx) const noexcept
{
idx += rotate_by;
if(idx >= end)
idx -= end;
return idx;
}
};
template<class Iterator>
auto make_rotate_right(Iterator unrotated, int rotate_by, int len)
{
using counting_iterator = thrust::counting_iterator<int>;
counting_iterator base_index(0);
using index_iterator = thrust::transform_iterator<rotate_right, counting_iterator>;
index_iterator permutation{base_index, rotate_right{rotate_by, len}};
return thrust::make_permutation_iterator(unrotated, permutation);
}
thrust::device_vector<float> foo()
{
int n = 4;
thrust::device_vector<float> v(n);
v[0] = 1.0f;
v[1] = 4.0f;
v[2] = 9.0f;
v[3] = 16.0f;
auto rotated = make_rotate_right(v.begin(), 2, n);
return {rotated, rotated + n};
}
int main()
{
auto f = foo();
for(auto fi: f)
std::cout << fi << '\n';
}
Prints
9
16
1
4
Upvotes: 3