Reputation: 97
I have the following loop setting:
for( int k = 0 ; k < N ; k++ )
{
    while( 1 )
    {
        int sum = 0;  // declared here so it is still in scope for the check below
        for( int i = 0 ; i < N ; i++ )
        {
            for( int j = nums[i] ; j < nums[i+1] ; j++ )
            {
                sum = sum + nums[i];
            }
        }
        if( sum == some_desired_value )
            break;
    }
}
I want to parallelise this code using OpenMP. N is a very large integer. Since the outer k loop is independent of what happens in the i loop, I planned to use parallel for on both the k and i loops, as below. But I know this is not a correct setup: the iterations of the k loop are already distributed across the threads, so each k iteration runs on a single thread, leaving no threads available to share the iterations of the inner i loop.
#pragma omp parallel for
for( int k = 0 ; k < N ; k++ )
{
    while( 1 )
    {
        int sum = 0;  // declared here so it is still in scope for the check below
        #pragma omp parallel for reduction(+:sum)
        for( int i = 0 ; i < N ; i++ )
        {
            for( int j = nums[i] ; j < nums[i+1] ; j++ )
            {
                sum = sum + nums[i];
            }
        }
        if( sum == some_desired_value )
            break;
    }
}
Is there a way to achieve the above setup?
PS: Nested parallelism might come in handy, but I am not sure whether I can use it here.
Upvotes: 0
Views: 384
Reputation: 3031
As long as N is much larger than the number of physical CPU cores at your disposal, you should not parallelize the inner loop as well. In parallel-programming terms, the outer loop already provides enough parallelism, so parallelizing the inner loop will at best add overhead from thread creation and at worst oversubscribe your system.
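To illustrate the point about the outer loop providing enough parallelism, here is a minimal sketch that puts the pragma on the k loop only and keeps everything inside it serial. The names work_for_k, run_outer_parallel and result are hypothetical, the while loop from the question is left out for brevity, and nums is assumed to hold n + 1 entries so that nums[i+1] is valid. The schedule(dynamic) clause is also an assumption: it may help if different k iterations take different amounts of time, but it is not required.

```c
/* Hypothetical stand-in for the work of one outer iteration k:
   a serial sum over the ranges [nums[i], nums[i+1]).
   Assumes nums has n + 1 entries. */
static long work_for_k(const int *nums, int n, int k)
{
    long sum = 0;  /* local, so each thread has its own copy */
    for (int i = 0; i < n; i++)
        for (int j = nums[i]; j < nums[i + 1]; j++)
            sum += nums[i] + k;
    return sum;
}

/* Only the outer loop is parallel; each k iteration runs on one thread.
   Without OpenMP enabled the pragma is ignored and the loop runs serially. */
void run_outer_parallel(const int *nums, int n, long *result)
{
    #pragma omp parallel for schedule(dynamic)
    for (int k = 0; k < n; k++)
        result[k] = work_for_k(nums, n, k);
}
```

Each thread writes to a distinct result[k], so no synchronisation is needed between iterations.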
Only if N is regularly of the same order of magnitude as the number of cores should you start thinking about further parallelizing the loop nest, which in this case may be far from trivial.
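For the case where the outer loop is too short to keep all cores busy, one possible option (an assumption on my part, not something prescribed above) is to leave the outer loop serial and parallelize the i loop instead, combining the per-thread partial sums with a reduction clause. The function name ranges_sum is hypothetical, and nums is again assumed to hold n + 1 entries.

```c
/* Sketch: parallelize the inner i loop with a sum reduction.
   Each thread accumulates a private partial sum, and OpenMP
   combines the partial sums when the loop ends.
   Assumes nums has n + 1 entries. */
long ranges_sum(const int *nums, int n)
{
    long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        for (int j = nums[i]; j < nums[i + 1]; j++)
            sum += nums[i];
    return sum;
}
```

Whether this pays off depends on how much work each i iteration does; for short ranges the cost of starting the parallel region on every pass of the while loop may outweigh the gain.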
Upvotes: 1