Serge Ballesta

Reputation: 148890

Incorrect double to long conversion

This is mainly a follow-up to this other question, which was about a weird conversion from long to double and back to long for big values.

I already know that converting a floating-point value to an integral type truncates, and that if the truncated value cannot be represented in the target type, the behaviour is undefined:

4.9 Floating-integral conversions [conv.fpint]

A prvalue of a floating point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.
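The rule quoted above can be turned into a range check that runs before the cast. This is only a minimal sketch: checked_to_ull is a hypothetical helper name, and it assumes a 64-bit unsigned long long and an IEEE-754 double.

```cpp
#include <cmath>

// Hypothetical helper (not part of the question's code): performs the
// conversion only when the truncated value fits in unsigned long long,
// so the cast itself never has undefined behaviour.
bool checked_to_ull(double d, unsigned long long& out)
{
    // 2^64 is exactly representable as a double; ULLONG_MAX is not.
    const double two_pow_64 = 18446744073709551616.0;

    // NaN has no truncated value. Anything in (-1, 0) truncates to 0,
    // which is representable, so the lower bound is exclusive at -1.
    if (std::isnan(d) || d <= -1.0 || d >= two_pow_64) {
        return false;
    }
    out = static_cast<unsigned long long>(d);
    return true;
}
```

For the value used in this question, static_cast<double>(0xf000000000000000ULL), the check passes, so the conversion back to unsigned long long is well defined by this rule.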

But here is my code to demonstrate the problem, assuming a little-endian architecture where both long long and long double use 64 bits:

#include <iostream>
#include <iomanip>

using namespace std;

int main()
{
  unsigned long long ull = 0xf000000000000000;
  long double d = static_cast<long double>(ull);
  // dump the IEEE-754 number for a little endian system
  unsigned char * pt = reinterpret_cast<unsigned char *>(&d);
  for (int i = sizeof(d) -1; i>= 0; i--) {
      cout << hex << setw(2) << setfill('0') << static_cast<unsigned int>(pt[i]); 
  }
  cout << endl;
  unsigned long long ull2 = static_cast<unsigned long long>(d);
  cout << ull << endl << d << endl << ull2 << endl;
  return 0;
}

The output is (using MSVC 2008 32-bit on an old XP 32-bit box):

43ee000000000000
f000000000000000
1.72938e+019
8000000000000000

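Explanations for values:

43ee000000000000: the bytes of d, most significant first. Read as an IEEE-754 double: sign 0, biased exponent 0x43e (unbiased 63), fraction 0xe000000000000, so the value is 1.875 * 2^63, which is exactly 0xf000000000000000
f000000000000000: ull, printed in hexadecimal because the hex manipulator is still in effect
1.72938e+019: d
8000000000000000: ull2, also in hexadecimal

As that value can be represented as an unsigned long long, I expected its conversion to unsigned long long to give back the original value, but MSVC gives 0x8000000000000000, or 9223372036854775808.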
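The first output line, 43ee000000000000, can be decoded field by field to confirm that d holds the original value exactly. The sketch below does this for normal (non-zero, non-subnormal, finite) numbers only; DoubleFields and decode_double are hypothetical names introduced for illustration, and a 64-bit IEEE-754 double is assumed.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical holder for the three IEEE-754 fields of a normal double.
struct DoubleFields {
    int sign;            // 0 for positive, 1 for negative
    int exponent;        // unbiased exponent
    double significand;  // in [1, 2), implicit leading 1 included
};

DoubleFields decode_double(double d)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");

    // Copy the object representation; memcpy avoids aliasing problems.
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);

    DoubleFields f;
    f.sign = static_cast<int>(bits >> 63);
    f.exponent = static_cast<int>((bits >> 52) & 0x7ff) - 1023;
    // The low 52 bits are the fraction; scale by 2^-52 and add the hidden 1.
    const std::uint64_t fraction = bits & 0x000fffffffffffffULL;
    f.significand = 1.0 + std::ldexp(static_cast<double>(fraction), -52);
    return f;
}
```

Applied to static_cast<double>(0xf000000000000000ULL), this yields sign 0, exponent 63 and significand 1.875, and 1.875 * 2^63 is 0xf000000000000000, so no precision is lost in the integer to floating-point direction.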

The question is: is that conversion caused by undefined behaviour, as suggested by the accepted answer to the other question, or is it really an MSVC bug?

(Note: the same code compiled with Clang on a FreeBSD 10.1 box gives correct results)

For reference, here is the generated code:

  unsigned long long ull2 = static_cast<unsigned long long>(d);
0041159E  fld         qword ptr [d] 
004115A1  call        @ILT+490(__ftol2) (4111EFh) 
004115A6  mov         dword ptr [ull2],eax 
004115A9  mov         dword ptr [ebp-40h],edx 

And the code for _ftol2 seems to be (got from debugger at execution time):

00411C66  push        ebp  
00411C67  mov         ebp,esp 
00411C69  sub         esp,20h 
00411C6C  and         esp,0FFFFFFF0h 
00411C6F  fld         st(0) 
00411C71  fst         dword ptr [esp+18h] 
00411C75  fistp       qword ptr [esp+10h] 
00411C79  fild        qword ptr [esp+10h] 
00411C7D  mov         edx,dword ptr [esp+18h] 
00411C81  mov         eax,dword ptr [esp+10h] 
00411C85  test        eax,eax 
00411C87  je          integer_QnaN_or_zero (411CC5h) 
00411C89  fsubp       st(1),st 
00411C8B  test        edx,edx 
00411C8D  jns         positive (411CADh) 
00411C8F  fstp        dword ptr [esp] 
00411C92  mov         ecx,dword ptr [esp] 
00411C95  xor         ecx,80000000h 
00411C9B  add         ecx,7FFFFFFFh 
00411CA1  adc         eax,0 
00411CA4  mov         edx,dword ptr [esp+14h] 
00411CA8  adc         edx,0 
00411CAB  jmp         localexit (411CD9h) 
00411CAD  fstp        dword ptr [esp] 
00411CB0  mov         ecx,dword ptr [esp] 
00411CB3  add         ecx,7FFFFFFFh 
00411CB9  sbb         eax,0 
00411CBC  mov         edx,dword ptr [esp+14h] 
00411CC0  sbb         edx,0 
00411CC3  jmp         localexit (411CD9h) 
00411CC5  mov         edx,dword ptr [esp+14h] 
00411CC9  test        edx,7FFFFFFFh 
00411CCF  jne         arg_is_not_integer_QnaN (411C89h) 
00411CD1  fstp        dword ptr [esp+18h] 
00411CD5  fstp        dword ptr [esp+18h] 
00411CD9  leave            
00411CDA  ret 
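One plausible reading of the listing above: fistp qword ptr stores a signed 64-bit integer, and the x87 FPU stores the "integer indefinite" pattern 0x8000000000000000 when the operand does not fit and the invalid-operation exception is masked. With that store, eax (the low dword) is 0, so the routine branches to integer_QnaN_or_zero, finds no bits set below the sign bit in the high dword, and returns edx:eax unchanged. The sketch below models only that net effect in portable C++. model_ftol2 is a hypothetical name, not the MSVC runtime source, and the in-range path is simplified to plain truncation.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical model of the routine's observable result, returned as the
// raw 64-bit pattern that would end up in edx:eax.
std::uint64_t model_ftol2(double d)
{
    const double two_pow_63 = 9223372036854775808.0;

    // Outside the signed 64-bit range (or NaN): the masked x87 response
    // is the "integer indefinite" bit pattern.
    if (std::isnan(d) || d >= two_pow_63 || d < -two_pow_63) {
        return 0x8000000000000000ULL;
    }
    // In range: the routine's fix-up code turns the FPU's rounding into
    // truncation; a signed cast gives the same result here.
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(d));
}
```

Under this model, the value from the question (about 1.73e19, which is above 2^63) produces 0x8000000000000000, matching the observed ull2. It would mean the signed conversion routine is being used for an unsigned target without any adjustment for values at or above 2^63.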

Upvotes: 11

Views: 1617

Answers (1)

Serge Ballesta

Reputation: 148890

This is mainly a compilation of the comments on the question.

It appears that old MSVC versions incorrectly process conversions between 64-bit unsigned integers and 64-bit double-precision numbers.

The bug is present in versions up to and including 2008.

MSVC 2010 is wrong in 32-bit mode and correct in 64-bit mode.

All versions starting with 2012 are correct.
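For code that still has to build with an affected compiler, a commonly used workaround is to route the conversion through the signed 64-bit path and add the top bit back by hand. This is only a sketch: to_ull_via_signed is a hypothetical name, and it assumes that the signed conversion is correct for values below 2^63 on those compilers, which has not been verified here.

```cpp
#include <cassert>

// Hypothetical workaround: converts a double to unsigned long long using
// only double -> long long conversions, which stay within the signed range.
unsigned long long to_ull_via_signed(double d)
{
    const double two_pow_63 = 9223372036854775808.0;
    const double two_pow_64 = 18446744073709551616.0;

    // Precondition: the truncated value must fit in unsigned long long.
    assert(d > -1.0 && d < two_pow_64);

    if (d < two_pow_63) {
        // Fits in a signed 64-bit integer: convert directly.
        return static_cast<unsigned long long>(static_cast<long long>(d));
    }
    // In [2^63, 2^64): the subtraction is exact for a double in this range,
    // the remainder fits in a signed 64-bit integer, and the top bit is
    // added back as an integer.
    const long long low = static_cast<long long>(d - two_pow_63);
    return static_cast<unsigned long long>(low) + 0x8000000000000000ULL;
}
```

With the value from the question, d - 2^63 is 0x7000000000000000, and adding 0x8000000000000000 gives back 0xf000000000000000.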

Upvotes: 1
