James Waldby - jwpat7
James Waldby - jwpat7

Reputation: 8711

Floating point anomaly when an unused statement is not commented out?

When the program as shown below is run, it produces ok output:

 j=         0    9007199616606190.000000 = x
 k=         0    9007199616606190.000000 = [x]
 r=  31443101                   0.000000 = m*(x-[x]) 

But when the commented-out line (i.e. //if (argc>1) r = atol(argv[1]);) is uncommented, it produces:

 j=     20000    9007199616606190.000000 = x
 k=     17285    9007199616606190.000000 = [x]
 r=  31443101                   0.000000 = m*(x-[x]) 

even though that line should have no effect, since argc>1 is false. Has anybody got a plausible explanation for this problem? Is it reproducible on any other systems?

 #include <stdio.h>
 #include <stdlib.h>
 #include <math.h>
 int main(int argc, char *argv[]) {
   int  j, k, m=10000;
   double r=31443101, jroot=sqrt(83), x;
   //if (argc>1) r = atol(argv[1]);
   x = r * r * jroot;
   j = m*(x-floor(x));
   k = floor(m*(x-floor(x)));
   printf ("j= %9d   %24.6f = x\n", j, x);
   printf ("k= %9d   %24.6f = [x]\n", k, floor(x));
   printf ("r= %9.0f   %24.6f = m*(x-[x]) \n", r, m*(x-floor(x))); 
   return 0;
 }

Note, test system = AMD Athlon 64 5200+ system with Linux 2.6.35.14-96.fc14.i686 (i.e., booted to run a 32-bit OS on 64-bit HW) with gcc (GCC) 4.5.1 20100924 (Red Hat 4.5.1-4)

Update -- A few hours ago I posted a comment that code generated with and without the if statement differed only in stack offsets and some skipped code. I now find that comment was not entirely correct; i.e. it is true for non-optimized code, but not true for the -O3 code I executed.

Effect of optimization switch on problem:

Anyhow, looking at -O3 -S code listings, the two cases differ mostly in skipped if code and stack offsets up to the line before call floor, at which point the with-if code has one more fstpl than the without-if code:

    ...  ;; code without comment:
fmul    %st, %st(1)
fxch    %st(1)
fstpl   (%esp)
fxch    %st(1)
fstpl   48(%esp)
fstpl   32(%esp)
call    floor
movl    $.LC2, (%esp)
fnstcw  86(%esp)
movzwl  86(%esp), %eax
    ...
    ...  ;; versus code with comment:
fmul    %st, %st(1)
fxch    %st(1)
fstpl   (%esp)
fxch    %st(1)
fstpl   48(%esp)
fstpl   32(%esp)
fstpl   64(%esp)
call    floor
movl    $.LC3, (%esp)
fnstcw  102(%esp)
movzwl  102(%esp), %eax
    ...

I haven't figured out the reason for the difference.

Upvotes: 9

Views: 324

Answers (4)

old_timer
old_timer

Reputation: 71556

EDITED:

I also do not see a difference when I compile your code on my computer using the -O0, -O1, -O2 and -O3.

AMD Phenom Quad 64 bit. gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3

I also tried clang (llvm) from release 3.0 with and without the if same results.

I agree that the compiler can pre-compute everything without that if line, you would definitely see that in the assembly output though.

Floating point and C can be nasty, lots of stuff to know to get it to really work. Forcing the int to double conversions is good for accuracy (c libraries in the compiler, even if the fpu is good have been known to have problems and the compilers C library it uses and the C library compiled into or used by your program can/will differ), but int to/from float is where FPU's tend to have their bugs (I think I saw that mentioned with TestFloat or somewhere like that). Might try running TestFloat on your system to see if your FPU is good. Between the famous pentium floating point bug and into the PentiumIV and beyond days most processors had floating point bugs, the pentium III I had was solid but the Pentium IV I had would fail. I rarely use floating point anymore so dont bother to test my systems.

Playing with optimization did change your results according to your edit so this is most likely a gcc problem or a combination of your code and gcc (and not a hardware fpu problem). Then try a different version of gcc on the same computer. 4.4.x instead of 4.5.x for example.

Upvotes: 0

caf
caf

Reputation: 239171

The reason that uncommenting that line might affect the result is that without that line, the compiler can see that r and jroot cannot change after initialisation, so it can calculate x at compile-time rather than runtime. When the line is uncommented, r might change, so the calculation of x must be deferred to runtime, which can result it in being done with a different precision (particularly if 387 floating point math is being used).

You can try using -mfpmath=sse -march=native to use the SSE unit for floating point calculations, which doesn't exhibit excess precision; or you can try using the -ffloat-store switch.

Your subtraction x - floor(x) exhibits catastrophic cancellation - this is the root cause of the problem something to be avoided ;).

Upvotes: 2

paxdiablo
paxdiablo

Reputation: 881993

Not duplicated on my system, Win7 running CygWin with gcc 4.3.4. Both with and without the if statement, the value of j is set to zero, not 20K.

My only suggestion would be to use gcc -S to get a look at the assembler output. That should hopefully tell you what's going wrong.

Specifically, generate the assembler output to two separate files, one each for the working and non-working variant, then vgrep them (eyeball them side by side) to try and ascertain the difference.


This is a serious failure in your environment by the way. With m being 10000, that means the x - floor(x) must be equal to 2. I can't for the life of me think of any real number where that would be the case :-)

Upvotes: 3

ruakh
ruakh

Reputation: 183446

I think there are two reasons why that line could have an effect:

  • Without that line, the values of all of these variables can be (and, IMHO, most likely are) determined at compile-time; with that line, the computations have to be performed at run-time. But obviously, the compiler's precomputed values are supposed to be the same as values computed at run-time, and I'm inclined to discount this as the actual reason for the different observed behavior. (It would certainly show up as a huge difference in the assembler output, though!)
  • On many machines, floating-point arithmetic is performed using more bits in intermediate values than can actually be stored in a double-precision floating-point number. Your second version, by creating two different code-paths to set x, basically restricts x to what can be stored in a double-precision floating-point number, whereas your first version can allow the initially-calculated value for x to still be available as an intermediate value, with extra bits, when computing subsequent values. (This could be the case whether all of these values are computed at compile-time or at run-time.)

Upvotes: 2

Related Questions