Reputation: 317

Float rounding error in time-critical C++ loop, looking for an efficient solution

As a premise, I am aware that this problem has been addressed already, but never in this specific scenario, from what I could find searching.

In a time-critical piece of code, I have a loop where a float value x must grow linearly from exactly 0 to-and-including exactly 1 in 'z' steps.

The un-optimized solution, but which would work without rounding errors, is:

const int z = (some number);
int c;
float x;

for(c=0; c<z; c++)
{
   x = (float)c/(float)(z-1);
   // do something with x here
}

obviously I can avoid the float conversions and use two loop variables and caching (float)(z-1):

const int z = (some number);
int c;
float xi,x;
const float fzm1 = (float)(z-1);

for(c=0,xi=0.f; c<z; c++, xi+=1.f)
{
   x=xi/fzm1;
   // do something with x
}

But who would ever repeat a division by a constant for every loop pass ? Obviously anyone would turn it into a multiplication:

const int z = (some number);
int c;
float xi,x;
const float invzm1 = 1.f/(float)(z-1);

for(c=0,xi=0.f; c<z; c++, xi+=1.f)
{
   x=xi * invzm1;
   // do something with x
}

Here is where obvious rounding issues may start to manifest. For some integer values of z, (z-1)*(1.f/(float)(z-1)) won't give exactly one but 0.999999..., so the value assumed by x in the last loop cycle won't be exactly one.

If using an adder instead, i.e

const int z = (some number);
int c;
float x;
const float x_adder = 1.f/(float)(z-1);

for(c=0,x=0.f; c<z; c++, x+=x_adder)
{
   // do something with x
}

the situation is even worse, because the error in x_adder will build up.

So the only solution I can see is using a conditional somewhere, like:

const int z = (some number);
int c;
float xi,x;
const float invzm1 = 1.f/(float)(z-1);

for(c=0,xi=0.f; c<z; c++, xi+=1.f)
{
   x = (c==z-1) ? 1.f : xi * invzm1;
   // do something with x
}

but in a time-critical loop a branch should be avoided if possible !

Oh, and I can't even split the loop and do


for(c=0,xi=0.f; c<z-1; c++, xi+=1.f) // note: loop runs now up to and including z-2
{
   x=xi * invzm1;
   // do something with x
}

x=1.f;
// do something with x

because I would have to replicate the whole block of code 'do something with x' which is not short or simple either, I cannot make of it a function call (would be inefficient, too many local variables to pass) nor I want to use #defines (would be very poor and inelegant and impractical).

Can you figure out any efficient or smart solution to this problem ?

Upvotes: 1

Answers (4)

Homer512

Reputation: 13419

First, a general consideration: xi += 1.f introduces a loop-carried dependency chain of however many cycles your CPU needs for floating point addition (probably 3 or 4). It also kills any attempt at vectorization unless you compile with -ffast-math. If you run on a modern super-scalar desktop CPU, I recommend using an integer counter and converting to float in each iteration.

In my opinion, avoiding int->float conversions is outdated advise from the era of x87 FPUs. Of course you have to consider the entire loop for the final verdict but the throughput is generally comparable to floating point addition.

For the actual problem, we may look at what others have done, for example Eigen in the implementation of their LinSpaced operation. There is also a rather extensive discussion in their bug tracker.

Their final solution is so simple that I think it is okay to paraphrase it here, simplified for your specific case:

float step = 1.f / (n - 1);
for(int i = 0; i < n; ++i)
    float x = (i + 1 == n) ? 1.f : i * step;

The compiler may choose to peel off the last iteration to get rid of the branch but in general it is not too bad anyway. In scalar code branch prediction will work well. In vectorized code it's a packed compare and a blend instruction.

We may also force the decision to peel off the last iteration by restructuring the code appropriately. Lambdas are very helpful for this since they are a) convenient to use and b) very strongly inlined.

auto loop_body = [&](int i, float x) mutable {
    ...;
};
for(int i = 0; i < n - 1; ++i)
    loop_body(i, i * step);
if(n > 0)
    loop_body(n - 1, 1.f);

Checking with Godbolt (using a simple array initialization for the loop body), GCC only vectorizes the second version. Clang vectorizes both but does a better job with the second.

Upvotes: 4

Damir Tenishev

Reputation: 3380

What you need is Bresenham's line algorithm.

It would allow you to avoid multiplication and divisions and use add/sub only. Just scale your range so that it could be represented by integer numbers and round up at final stage if precise split to parts is mathematically (or "representatively") impossible.

Upvotes: 1

Eugene

Reputation: 7688

I suggest that you start with the last alternative you have shown and use lambda to avoid passing local variables:

auto do_something_with_x = [&](float x){/*...*/}

for(c=0,xi=0.f; c<z-1; c++, xi+=1.f) // note: loop runs now up to and including z-2
{
   x=xi * invzm1;
   do_something_with_x(x);
}

do_something_with_x(1.f);

Upvotes: 0

Arkady

Reputation: 2207

Consider using this:

const int z = (some number > 0);
const int step = 1000000/z;
for(int c=0; c<z-1; ++c)
{
   x += step; //just if you really need the conversion, divide it by 1000000 when required
   // do something with x
}
x = 1.f;
//do the last step with x

No conversions if you don't really need it, first and last values are as expected, multiplication is reduced to accumulation. By changing of 1000000 you can manually control the precision.

Upvotes: 0

Float rounding error in time-critical C++ loop, looking for an efficient solution

Answers (4)

Related Questions