Reputation: 27125

How to reduce C/C++ floating-point roundoff

Are there any generally-applicable tips to reduce the accumulation of floating-point roundoff errors in C or C++? I'm thinking mainly about how to write code that gets compiled into optimal assembly language instructions, although strategies on overall algorithm design are also welcome.

Upvotes: 2

Answers (5)

syplex

Reputation: 1167

There are many small things you can do, such as performing as many floating-point operations as possible in a single expression and making sure that all inputs to an operation are converted to floating-point format first. When converting from floating point to an integer, add 0.5 to the value before the conversion so that it is rounded to the nearest integer instead of being truncated. This works for non-negative values only; for negative values subtract 0.5 instead, or use a rounding function such as lround. Using doubles or long doubles increases the available precision and so lessens the significance of rounding and accumulated errors.
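The rounding and precision points above can be sketched in C++ as follows. This is only an illustration: the function names (add_half_and_truncate, round_to_nearest, sum_in_double) are made up for the example, and std::lround is one of several standard rounding functions that could be used.

```cpp
#include <cmath>
#include <vector>

// Naive idiom: add 0.5, then truncate. The conversion truncates toward
// zero, so this only rounds correctly when x is non-negative.
long add_half_and_truncate(double x) {
    return static_cast<long>(x + 0.5);
}

// std::lround rounds to the nearest integer (halfway cases away from
// zero) and also handles negative values.
long round_to_nearest(double x) {
    return std::lround(x);
}

// Accumulate float inputs in a double, so that each addition is rounded
// at double precision instead of float precision.
double sum_in_double(const std::vector<float>& values) {
    double total = 0.0;
    for (float v : values) {
        total += v;
    }
    return total;
}
```

For example, add_half_and_truncate(-2.7) yields -2, because -2.2 is truncated toward zero, whereas round_to_nearest(-2.7) yields -3.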

You will always have some roundoff error, so what you really want is to push it below the significance you care about. One option is an extended-precision floating-point software library, such as the High Precision Arithmetic Library. A library gives you higher precision at the cost of slower operation.

Upvotes: 1

Pete Becker

Reputation: 76458

People get PhDs writing about this stuff, so you won't get really solid advice here, just tips. One tip is to avoid subtracting numbers that are fairly close in value: the leading digits cancel, which amplifies the effect of the noise bits.
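As a small sketch of that tip, consider sqrt(x + 1) - sqrt(x) for large x. The two square roots are nearly equal, so the direct subtraction loses most of its significant digits, while an algebraically equivalent form avoids the subtraction altogether. The function names are hypothetical, and this particular rearrangement is just one textbook example, not a general recipe.

```cpp
#include <cassert>
#include <cmath>

// Direct form: subtracts two nearly equal values when x is large,
// so most of the significant digits cancel.
double sqrt_diff_naive(double x) {
    assert(x >= 0.0);
    return std::sqrt(x + 1.0) - std::sqrt(x);
}

// Rearranged form, using
//   (sqrt(x+1) - sqrt(x)) * (sqrt(x+1) + sqrt(x)) = 1.
// It contains no subtraction of close values.
double sqrt_diff_stable(double x) {
    assert(x >= 0.0);
    return 1.0 / (std::sqrt(x + 1.0) + std::sqrt(x));
}
```

With x = 1e15 the true value is roughly 1.58e-8. The rearranged form should reproduce it to nearly full double precision, while the direct form is typically off by several percent.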

Upvotes: 2

AProgrammer

Reputation: 52324

Numerical analysis is a whole field of mathematics, and it can't be reduced to a few tips that one can apply blindly.

Upvotes: 6

Leon

Reputation: 1141

You can enable the FPU's extended floating-point precision so that it uses the 10-byte (80-bit) format internally. This is what we use.

http://www.website.masmforum.com/tutorials/fptute/fpuchap1.htm

You can also sort numbers so that operations are performed on numbers of similar magnitude.
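One way to read the sorting suggestion is to add the values in order of increasing magnitude, so that the small values can accumulate before they meet a large one. The sketch below assumes that interpretation; sum_in_order and sum_smallest_first are names invented for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Plain left-to-right summation in the order given.
double sum_in_order(const std::vector<double>& values) {
    double total = 0.0;
    for (double v : values) {
        total += v;
    }
    return total;
}

// Sort a copy by absolute value (smallest first), then sum.
// The vector is taken by value so the caller's data is left untouched.
double sum_smallest_first(std::vector<double> values) {
    std::sort(values.begin(), values.end(),
              [](double a, double b) { return std::fabs(a) < std::fabs(b); });
    return sum_in_order(values);
}
```

For an input of 1e16 followed by a thousand copies of 1.0, each in-order addition of 1.0 to 1e16 is normally rounded away under IEEE 754 double arithmetic, whereas summing smallest first keeps the contribution of the thousand ones.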

Upvotes: 0

Mark Ransom

Reputation: 308520

The only trick I know is that when you're summing a bunch of numbers, don't add them one at a time; group them so that the additions are on numbers of approximately the same magnitude. To sum a huge array of random numbers, for example, recursively sum by pairs.

Upvotes: 4
