Reputation: 27125
Are there any generally-applicable tips to reduce the accumulation of floating-point roundoff errors in C or C++? I'm thinking mainly about how to write code that gets compiled into optimal assembly language instructions, although strategies on overall algorithm design are also welcome.
Upvotes: 2
Views: 3983
Reputation: 1167
There are many small things you can do, such as performing as many floating-point operations as possible in a single expression and making sure that all inputs to an operation are converted to floating-point format first. When converting from floating point to an integer, add 0.5 to the value (an offset, not a factor) before the conversion so that it rounds to the nearest integer instead of truncating. Note that this only works for non-negative values, because the conversion truncates toward zero; std::lround handles both signs. Using double or long double increases the available precision and thus lessens the significance of rounding and accumulated errors.
You will always have some roundoff error, so what you really want is to push it below the significance you care about. One option is an extended-precision floating-point software library, such as the High Precision Arithmetic Library. A library gives you higher precision at the cost of slower operation.
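To illustrate the rounding tip, here is a minimal sketch comparing the "add 0.5 and truncate" approach with the standard library's std::lround. The function names round_naive and round_nearest are made up for this example.

```cpp
#include <cmath>

// Hypothetical helper: add 0.5 and truncate. The cast truncates toward
// zero, so this only rounds correctly for non-negative input.
long round_naive(double x) {
    return static_cast<long>(x + 0.5);
}

// Hypothetical helper: std::lround rounds to the nearest integer, with
// halfway cases going away from zero, and works for negative values too.
long round_nearest(double x) {
    return std::lround(x);
}
```

For example, round_naive(-2.7) truncates -2.2 to -2, while round_nearest(-2.7) gives -3.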
Upvotes: 1
Reputation: 76458
People get PhDs writing about this stuff, so you won't get really solid advice here, just tips. One tip is to avoid subtracting numbers that are fairly close in value; that amplifies the effect of the noise bits.
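As a rough illustration of the subtraction tip, consider computing 1 - cos(x) for small x. This is just one example I picked, and the function names are made up; the rewrite relies on the identity 1 - cos(x) = 2 sin^2(x/2).

```cpp
#include <cmath>

// Direct form: for small x, cos(x) is extremely close to 1, so the
// subtraction cancels nearly all of the significant bits.
double one_minus_cos_naive(double x) {
    return 1.0 - std::cos(x);
}

// Algebraically equivalent form, 1 - cos(x) = 2 * sin^2(x / 2), which
// avoids subtracting two nearly equal values.
double one_minus_cos_stable(double x) {
    const double s = std::sin(x / 2.0);
    return 2.0 * s * s;
}
```

For x = 1e-8 the true value is about 5e-17. The stable form gets close to that, while the naive form loses essentially every significant digit, because cos(1e-8) is indistinguishable from 1 at double precision.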
Upvotes: 2
Reputation: 52324
Numerical analysis is a whole field of mathematics, and it can't be reduced to a few tips that one can apply blindly.
Upvotes: 6
Reputation: 1141
You can enable the extended-precision mode of the x87 FPU, which uses 10 bytes (80 bits) internally. This is what we use.
http://www.website.masmforum.com/tutorials/fptute/fpuchap1.htm
You can also sort numbers so that operations are performed on numbers of similar magnitude.
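A minimal sketch of the sorting idea, assuming the data fits in a std::vector<double> and that plain IEEE double arithmetic is used (no -ffast-math, no x87 extended registers). The function names are made up for this example.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Plain left-to-right accumulation in the order given.
double sum_in_order(const std::vector<double>& v) {
    double s = 0.0;
    for (double x : v) s += x;
    return s;
}

// Sort a copy by ascending magnitude first, so the small terms are
// accumulated together before they meet the large ones.
double sum_sorted_by_magnitude(std::vector<double> v) {
    std::sort(v.begin(), v.end(),
              [](double a, double b) { return std::fabs(a) < std::fabs(b); });
    return sum_in_order(v);
}
```

With one value of 1e16 followed by a thousand 1.0s, adding in the given order loses every 1.0, since adjacent doubles near 1e16 are 2 apart. Summing the small values first keeps them. Sorting costs O(n log n), so this is only worth it when the magnitudes really do differ that much.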
Upvotes: 0
Reputation: 308520
The only trick I know is that when you're summing a bunch of numbers, don't add them one at a time; group them so that each addition is on numbers of approximately the same magnitude. To sum a huge array of random numbers, for example, recursively sum by pairs.
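Here is a minimal sketch of summing recursively by pairs (often called pairwise summation). The function name pairwise_sum is my own, and a production version would probably switch to a simple loop for small blocks to cut the recursion overhead.

```cpp
#include <cstddef>
#include <vector>

// Recursively split the range in half and add the two partial sums, so
// each addition combines values built from a similar number of terms.
double pairwise_sum(const double* data, std::size_t n) {
    if (n == 0) return 0.0;
    if (n == 1) return data[0];
    const std::size_t half = n / 2;
    return pairwise_sum(data, half) + pairwise_sum(data + half, n - half);
}

// Convenience overload for vectors.
double pairwise_sum(const std::vector<double>& v) {
    return pairwise_sum(v.data(), v.size());
}
```

The recursion depth is only about log2(n), and the worst-case rounding error grows roughly with log n rather than with n as in a straight loop.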
Upvotes: 4