Reputation: 1412
I have an implementation of a digital bessel filter which is a performance bottleneck for my program. I would like to parallelize the main loop with OpenMP.
The function takes an input array paddedsignal
and arrays of filter coefficients dcof
and ccof
and produces a temporary array temp
which is filtered. The main loop of this process looks like this:
for (i=order; i<end; i++)
{
temp[i] = ccof[0]*paddedsignal[i];
for (p=1; p<=order; p++)
{
temp[i] += ccof[p]*paddedsignal[i-p] - dcof[p]*temp[i-p];
}
}
In particular, note that the value of temp[i]
depends on previous values of temp[i-p]
. This confounds a simple #pragma omp for
directive, since the value of temp[i]
near the boundaries between sections of the array which would be handled by different threads has race condition problems. Basically if thread one takes indices between 0-99 and thread 2 takes 100-199, the value at 100 will be wrong since it will be computed before the value at 99, on which it depends.
Is there a way to salvage this situation? I am despairing as to what I can do to parallelize this since the fact that filtered values depend on neighboring values makes it inherently a serial calculation. Is there any way around this?
Upvotes: 1
Views: 485
Reputation: 22660
You basically cannot parallelize the outer loop due to the dependency you correctly identify.
You could parallelize the inner loop with a reduction. But that would only be helpful for large order
and even then probably just lead to false sharing because the data is shuffled around. One might concieve a complicated data laout to interleave signal
and temp
among threads, which would require sort-of a manual work sharing construct. I doubt it would be worth it anyway, since the reduction will add overhead cost.
It is easier to parallelize filters that have a finite impulse response, if that is an alternative for you. You could also segment the whole signal into slightly overlapping segments and compute the filter independently, but there will be some computational error at the borders of those chunks.
Upvotes: 1
Reputation: 664
Maybe you are looking for the #pragma omp for ordered. It executes the code in the same order in which it would be executed sequentially
But keep in mind:
For more information see: omp ordered
Upvotes: 0