Reputation: 43
I'm working on a project that requires massively parallel computing. The tricky part is that the project contains a nested loop, like this:
for(int i=0; i<19; ++i){
    for(int j=0; j<57; ++j){
        //the computing section
    }
}
To achieve the highest gain, I need to parallelise both levels of the loop, like this:
parallel_for_each{
    parallel_for_each{
        //computing section
    }
}
I tested this and found that C++ AMP doesn't support nested parallel_for_each loops. Does anyone have an idea how to solve this problem? Thanks
Upvotes: 4
Views: 1018
Reputation: 13723
You could, as @High Performance Mark suggests, collapse the two loops into one. However, you don't need to do this with C++ AMP because it supports 2- and 3-dimensional extents on arrays and array_views. You can then use an index as a multi-dimensional index.
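#include <amp.h>
using namespace concurrency;

// Declared before use so the kernel below can call it.
float func(const float v) restrict(amp) { return v * v; }

array<float, 2> x(19, 57);
// An array must be captured by reference; an array_view would be captured by value.
parallel_for_each(x.extent, [&x](index<2> idx) restrict(amp)
{
    x[idx] = func(x[idx]);
});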
You can access the individual sub-indices in idx using:
int row = idx[0];
int col = idx[1];
You should also consider the amount of work being done by the computing section. If it is relatively small, you may want to have each thread process more than one element of the array, x.
The following article is also worth reading because, just as on the CPU, loops that do not access memory efficiently can have a big impact on performance: Arrays are Row Major in C++ AMP
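To make the row-major point concrete, here is a minimal CPU-side sketch in plain C++ (not C++ AMP). The names kRows, kCols, offset and sum_row_major are made up for illustration. It assumes the 19 x 57 shape from the question and shows that the last index is the one that moves through adjacent memory locations.

```cpp
#include <cstddef>
#include <vector>

// Shape taken from the question: 19 rows, 57 columns.
constexpr std::size_t kRows = 19;
constexpr std::size_t kCols = 57;

// Row-major offset: all elements of one row are stored next to each other.
// (Hypothetical helper, not part of C++ AMP.)
constexpr std::size_t offset(std::size_t row, std::size_t col) {
    return row * kCols + col;
}

// Memory-friendly traversal: the inner loop walks the last index (col),
// so consecutive iterations touch adjacent elements.
float sum_row_major(const std::vector<float>& data) {
    float total = 0.0f;
    for (std::size_t row = 0; row < kRows; ++row) {
        for (std::size_t col = 0; col < kCols; ++col) {
            total += data[offset(row, col)];
        }
    }
    return total;
}
```

In the C++ AMP kernel the same reasoning would suggest letting neighbouring threads differ in idx[1] rather than idx[0], though the actual benefit depends on the hardware and on what the computing section does.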
Upvotes: 4
Reputation: 78324
So collapse the loops:
for(int ij=0; ij<19*57; ++ij){
    //if required, extract i and j from ij (i = ij / 57, j = ij % 57)
    //the computing section
}
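As a rough sketch of how i and j could be recovered from the collapsed index, assuming the 19 and 57 bounds from the question: integer division gives the outer index and the remainder gives the inner one. The compute function here is only a hypothetical stand-in for the computing section, and run_collapsed is an illustrative name.

```cpp
#include <cassert>
#include <vector>

// Loop bounds taken from the question.
constexpr int kRows = 19;
constexpr int kCols = 57;

// Hypothetical placeholder for "the computing section".
inline int compute(int i, int j) { return i * 100 + j; }

// Runs the collapsed loop and returns one result per (i, j) pair.
std::vector<int> run_collapsed() {
    std::vector<int> out(kRows * kCols);
    for (int ij = 0; ij < kRows * kCols; ++ij) {
        const int i = ij / kCols;  // outer index, 0..18
        const int j = ij % kCols;  // inner index, 0..56
        assert(i < kRows && j < kCols);
        out[ij] = compute(i, j);
    }
    return out;
}
```

Each value of ij maps to exactly one (i, j) pair, so the single loop visits the same 19 * 57 iterations as the nested version and can be handed to a one-dimensional parallel loop.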
Upvotes: 0