Reputation: 8711
When the program as shown below is run, it produces ok output:
j= 0 9007199616606190.000000 = x
k= 0 9007199616606190.000000 = [x]
r= 31443101 0.000000 = m*(x-[x])
But when the commented-out line (i.e. //if (argc>1) r = atol(argv[1]);
) is uncommented, it produces:
j= 20000 9007199616606190.000000 = x
k= 17285 9007199616606190.000000 = [x]
r= 31443101 0.000000 = m*(x-[x])
even though that line should have no effect, since argc>1
is false. Has anybody got a plausible explanation for this problem? Is it reproducible on any other systems?
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int main(int argc, char *argv[]) {
int j, k, m=10000;
double r=31443101, jroot=sqrt(83), x;
//if (argc>1) r = atol(argv[1]);
x = r * r * jroot;
j = m*(x-floor(x));
k = floor(m*(x-floor(x)));
printf ("j= %9d %24.6f = x\n", j, x);
printf ("k= %9d %24.6f = [x]\n", k, floor(x));
printf ("r= %9.0f %24.6f = m*(x-[x]) \n", r, m*(x-floor(x)));
return 0;
}
Note, test system = AMD Athlon 64 5200+ system with Linux 2.6.35.14-96.fc14.i686 (i.e., booted to run a 32-bit OS on 64-bit HW) with gcc (GCC) 4.5.1 20100924 (Red Hat 4.5.1-4)
Update -- A few hours ago I posted a comment that code generated with and without the if
statement differed only in stack offsets and some skipped code. I now find that comment was not entirely correct; i.e. it is true for non-optimized code, but not true for the -O3 code I executed.
Effect of optimization switch on problem:
j=20000
and k=17285
j=20000
(an error) and k=0
(OK)Anyhow, looking at -O3 -S code listings, the two cases differ mostly in skipped if
code and stack offsets up to the line before call floor
, at which point the with-if code has one more fstpl
than the without-if code:
... ;; code without comment:
fmul %st, %st(1)
fxch %st(1)
fstpl (%esp)
fxch %st(1)
fstpl 48(%esp)
fstpl 32(%esp)
call floor
movl $.LC2, (%esp)
fnstcw 86(%esp)
movzwl 86(%esp), %eax
...
... ;; versus code with comment:
fmul %st, %st(1)
fxch %st(1)
fstpl (%esp)
fxch %st(1)
fstpl 48(%esp)
fstpl 32(%esp)
fstpl 64(%esp)
call floor
movl $.LC3, (%esp)
fnstcw 102(%esp)
movzwl 102(%esp), %eax
...
I haven't figured out the reason for the difference.
Upvotes: 9
Views: 324
Reputation: 71556
EDITED:
I also do not see a difference when I compile your code on my computer using the -O0, -O1, -O2 and -O3.
AMD Phenom Quad 64 bit. gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3
I also tried clang (llvm) from release 3.0 with and without the if same results.
I agree that the compiler can pre-compute everything without that if line, you would definitely see that in the assembly output though.
Floating point and C can be nasty, lots of stuff to know to get it to really work. Forcing the int to double conversions is good for accuracy (c libraries in the compiler, even if the fpu is good have been known to have problems and the compilers C library it uses and the C library compiled into or used by your program can/will differ), but int to/from float is where FPU's tend to have their bugs (I think I saw that mentioned with TestFloat or somewhere like that). Might try running TestFloat on your system to see if your FPU is good. Between the famous pentium floating point bug and into the PentiumIV and beyond days most processors had floating point bugs, the pentium III I had was solid but the Pentium IV I had would fail. I rarely use floating point anymore so dont bother to test my systems.
Playing with optimization did change your results according to your edit so this is most likely a gcc problem or a combination of your code and gcc (and not a hardware fpu problem). Then try a different version of gcc on the same computer. 4.4.x instead of 4.5.x for example.
Upvotes: 0
Reputation: 239171
The reason that uncommenting that line might affect the result is that without that line, the compiler can see that r
and jroot
cannot change after initialisation, so it can calculate x
at compile-time rather than runtime. When the line is uncommented, r
might change, so the calculation of x
must be deferred to runtime, which can result it in being done with a different precision (particularly if 387 floating point math is being used).
You can try using -mfpmath=sse -march=native
to use the SSE unit for floating point calculations, which doesn't exhibit excess precision; or you can try using the -ffloat-store
switch.
Your subtraction x - floor(x)
exhibits catastrophic cancellation - this is the root cause of the problem something to be avoided ;).
Upvotes: 2
Reputation: 881993
Not duplicated on my system, Win7 running CygWin with gcc 4.3.4. Both with and without the if
statement, the value of j
is set to zero, not 20K.
My only suggestion would be to use gcc -S
to get a look at the assembler output. That should hopefully tell you what's going wrong.
Specifically, generate the assembler output to two separate files, one each for the working and non-working variant, then vgrep them (eyeball them side by side) to try and ascertain the difference.
This is a serious failure in your environment by the way. With m
being 10000, that means the x - floor(x)
must be equal to 2. I can't for the life of me think of any real number where that would be the case :-)
Upvotes: 3
Reputation: 183446
I think there are two reasons why that line could have an effect:
x
, basically restricts x
to what can be stored in a double-precision floating-point number, whereas your first version can allow the initially-calculated value for x
to still be available as an intermediate value, with extra bits, when computing subsequent values. (This could be the case whether all of these values are computed at compile-time or at run-time.)Upvotes: 2