Reputation: 1797
I have a vector which contains some floating-point values that are reasonably far apart from each other, sorted according to some function. For example,
double foo(double x)
{
    return 199.1*x;
}

double x = 3000000.3157;
double y = x + DBL_EPSILON;
std::vector<double> s { y, y+10 };
std::sort(s.begin(), s.end(), [](double x, double y) { return foo(x) < foo(y); });
Now someone has a key which is close to the one I have in s, such as x. In the era of lambdas, everyone has their own little function to search, like
std::cout << std::distance(s.begin(), std::lower_bound(s.begin(), s.end(), x,
    [](double x, double y) { return foo(x) < foo(y); })) << std::endl;

std::cout << std::distance(s.begin(), std::lower_bound(s.begin(), s.end(), x,
    [](double x, double y) { double f1 = foo(x);
                             double f2 = foo(y);
                             return f1 < f2; })) << std::endl;
and gets a different position (and the corresponding values are very different).
While looking at the usage, it appeared they are related to finding, for a key k, a value r (which ideally should be in [0,1]) attached to consecutive values x1 & x2, such that the return value of a function f(x1,x2,r) is approximately equal to k. Both of them look related to each other, and to interpolation. How do I implement them?
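To illustrate what I mean: if f were plain linear interpolation, f(x1,x2,r) = x1 + r*(x2-x1) (just my assumption for the sake of the example), then the r I am after would be computed roughly like this:
// Illustrative sketch only, assuming f(x1, x2, r) = x1 + r*(x2 - x1);
// the name inverse_lerp is made up for this example.
double inverse_lerp(double x1, double x2, double k)
{
    // r such that x1 + r*(x2 - x1) == k; ideally in [0, 1] when x1 <= k <= x2
    return (k - x1) / (x2 - x1);
}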
NOTE:
In the short code below
double f1 = foo(x);
double f2 = foo(y);
bool l = foo(x) < foo(y);
std::cout << std::boolalpha << (f1 < f2) << " " << l << " " << (f1 == f2) << std::endl;
std::cout << std::boolalpha << (foo(x) < foo(y)) << " " << (foo(y) < foo(x))
          << " " << (foo(x) == foo(y)) << std::endl;
std::cout << std::boolalpha << std::isless(foo(x), foo(y))
          << " " << std::isless(foo(y), foo(x)) << std::endl;
I get the following output with GCC on an x86 machine:
false true true
true true false
false false
My guess is that GCC does higher-precision (80-bit) arithmetic on the fly unless I force it to store the result, producing different results for l and (f1 < f2) (which caused the problem stated above). I would also be interested to know why foo(x) < foo(y) and foo(y) < foo(x) both say true!
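One workaround I am considering (just a sketch based on my guess, not something I have verified) is to force each result of foo through a 64-bit memory slot before comparing, e.g. via volatile, or to compile with GCC's -ffloat-store:
// Sketch: force foo's results to be stored as 64-bit doubles before the
// comparison, so any 80-bit intermediate value is rounded away first.
auto cmp = [](double a, double b)
{
    volatile double fa = foo(a);   // volatile forces a store to memory
    volatile double fb = foo(b);
    return fa < fb;
};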
Upvotes: 3
Views: 246
Reputation: 17956
These two statements accomplish nothing, because DBL_EPSILON is smaller than 1 ULP for these numbers:
double x = 3000000.3157;
double y = x + DBL_EPSILON;
To be sure, I printed the hexadecimal representation of both x and y and got the following:
4146E3602868DB8C
4146E3602868DB8C
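If the intent was to get a value that actually differs from x by one ULP, something like std::nextafter would do it (a sketch of what I assume you meant, not what your code currently does):
#include <cmath>
#include <limits>

double x = 3000000.3157;
// Next representable double above x -- i.e. x plus one ULP -- unlike
// x + DBL_EPSILON, which rounds back to x at this magnitude.
double y = std::nextafter(x, std::numeric_limits<double>::infinity());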
When I run the example at the bottom of your question through a couple of different versions of G++ (4.4.5 and 4.8.0), with optimization both on (-O3) and off (no flags), I get the following output:
false false true
false false true
0 0
I suspect the behavior you're seeing is for precisely the reason you postulate: Your compiler is carrying greater precision for intermediate results and that's bleeding through to these comparisons.
What version of the compiler are you using, and does other code in the application adjust any of the rounding modes? What compile flags are you using?
EDIT 1
I was able to reproduce your behavior by recompiling with optimization off and in 32-bit mode. In that mode, I see that the compiler leaves the result of foo on the floating-point stack:
_Z3food:
.LFB1053:
.cfi_startproc
pushl %ebp #
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp #,
.cfi_def_cfa_register 5
subl $8, %esp #,
movl 8(%ebp), %eax # x, tmp61
movl %eax, -8(%ebp) # tmp61, x
movl 12(%ebp), %eax # x, tmp62
movl %eax, -4(%ebp) # tmp62, x
fldl -8(%ebp) # x
fldl .LC0 #
fmulp %st, %st(1) #,
leave
That suggests that this is a quirk of the i386 ABI. To test this theory, I looked at the i386 ABI more closely. On page 38 of this PDF (aka. "page 3-12" by the internal page numbers), I find what is likely the smoking gun:
%st(0)
Floating-point return values appear on the top of the floating-point register stack; there is no difference in the representation of single- or double-precision values in floating-point registers. If the function does not return a floating-point value, then this register must be empty. This register must be empty before entry to a function.
It goes on to say a few paragraphs later:
A floating-point return value appears on the top of the Intel387 register stack. The caller then must remove the value from the Intel387 stack, even if it doesn’t use the value. Failure of either side to meet its obligations leads to undefined program behavior. The standard calling sequence does not include any method to detect such failures nor to detect return value type mismatches. Therefore the user must declare all functions properly. There is no difference in the representation of single-, double- or extended-precision values in floating-point registers.
Searching further down to pages 3-27 (PDF page 53) and 3-28 (PDF page 54) gives the following confusing twists. The table in Figure 3-30 suggests that the initial rounding mode is "53-bit (double precision)", and that that's the mode at process initialization.
It goes on further to give the following warning on the next page:
The initial floating-point state should be changed with care. In particular, many floating-point routines may produce undefined behavior if the precision control is set to less than 53 bits. The _fpstart routine (see Chapter 6) changes the precision control to 64 bits and sets all exceptions to be masked. This is the default state required for conformance to the ANSI C standard and to the IEEE 754 Floating-point standard.
A couple references on the net indicate Linux does indeed set the x87 to extended precision (at least in the 32-bit ABI).
EDIT 2
It appears extended precision is indeed the culprit. I added the following code to the test case, as suggested by this page:
void set_fpu (unsigned int mode)
{
    asm ("fldcw %0" : : "m" (*&mode));
}

// ...
set_fpu(0x27F);
With those lines added, the test case returns the same values I saw with the 64-bit ABI.
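Concretely, the test I ran looks roughly like the following (my reconstruction of the NOTE snippet from your question with the control-word change added; 0x27F is the usual "all exceptions masked, 53-bit precision control" setting, versus the 0x37F extended-precision default on Linux):
#include <cfloat>
#include <cmath>
#include <iostream>

double foo(double x) { return 199.1 * x; }

// 0x27F = all x87 exceptions masked, precision control = 53-bit double.
// (The Linux default control word is 0x37F, i.e. 64-bit extended precision.)
void set_fpu(unsigned int mode)
{
    asm ("fldcw %0" : : "m" (*&mode));
}

int main()
{
    set_fpu(0x27F);                 // must run before any foo() comparisons

    double x = 3000000.3157;
    double y = x + DBL_EPSILON;

    double f1 = foo(x);
    double f2 = foo(y);
    bool   l  = foo(x) < foo(y);

    std::cout << std::boolalpha << (f1 < f2) << " " << l << " " << (f1 == f2) << std::endl;
    std::cout << std::boolalpha << (foo(x) < foo(y)) << " " << (foo(y) < foo(x))
              << " " << (foo(x) == foo(y)) << std::endl;
    std::cout << std::boolalpha << std::isless(foo(x), foo(y))
              << " " << std::isless(foo(y), foo(x)) << std::endl;
}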
So, assuming you're compiling a 32-bit program under Linux, this appears to be the reason you're seeing strange comparison and sorting results.
Can you re-run your sorting and searching code with the FPU set to 53-bit precision as I did above, and see if that resolves the differences you saw between your two lambda expressions?
Upvotes: 6