Reputation: 2449
frexpl
won't work because it keeps the mantissa as part of a long double. Can I use type punning, or would that be dangerous? Is there another way?
Upvotes: 0
Views: 975
Reputation: 365237
x86's float and integer endianness is little-endian, so the significand (aka mantissa) is the low 64 bits of an 80-bit x87 long double
.
In assembly, you just load the normal way, like mov rax, [rdi]
.
Unlike IEEE binary32 (float
) or binary64 (double
), 80-bit long double stores the leading 1 in the significand explicitly. (Or 0
for subnormal). https://en.wikipedia.org/wiki/Extended_precision#x86_extended_precision_format
So the unsigned integer value (magnitude) of the true significand is the same as what's actually stored in the object-representation.
If you want signed int
, too bad; including the sign bit it would be 65 bits but int
is only 32-bit on any x86 C implementation.
If you want int64_t
, you could maybe right shift by 1 to discard the low bit, making room for a sign bit. Then do 2's complement negation if the sign bit was set, leaving you with a signed 2's complement representation of the significand value divided by 2. (IEEE FP uses sign/magnitude with a sign bit at the top of the bit-pattern)
In C/C++, yes you need to type-pun, e.g. with a union or memcpy
. All C implementations on x86 / x86-64 that expose 80-bit floating point at all use a 12 or 16-byte type with the 10-byte value at the bottom.
Beware that MSVC uses long double
= double
, a 64-bit float, so check LDBL_MANT_DIG
from float.h
, or sizeof(long double)
. All 3 static_assert()
statements trigger on MSVC, so they all did their job and saved us from copying a whole binary64 double
(sign/exp/mantissa) into our uint64_t
.
// valid C11 and C++11
#include <float.h> // float numeric-limit macros
#include <stdint.h>
#include <assert.h> // C11 static assert
#include <string.h> // memcpy
// inline
uint64_t ldbl_mant(long double x)
{
// we can assume CHAR_BIT = 8 when targeting x86, unless you care about DeathStation 9000 implementations.
static_assert( sizeof(long double) >= 10, "x87 long double must be >= 10 bytes" );
static_assert( LDBL_MANT_DIG == 64, "x87 long double significand must be 64 bits" );
uint64_t retval;
memcpy(&retval, &x, sizeof(retval));
static_assert( sizeof(retval) < sizeof(x), "uint64_t should be strictly smaller than long double" ); // sanity check for wrong types
return retval;
}
This compiles efficiently on gcc/clang/ICC (on Godbolt) to just one instruction as a stand-alone function (because the calling convention passes long double
in memory). After inlining into code with a long double
in an x87 register, it will presumably compile to a TBYTE x87 store and an integer reload.
## gcc/clang/ICC -O3 for x86-64
ldbl_mant:
mov rax, QWORD PTR [rsp+8]
ret
For 32-bit, gcc has a weird redundant-copy missed-optimization bug which ICC and clang don't have; they just do the 2 loads from the function arg without copying first.
# GCC -m32 -O3 copies for no reason
ldbl_mant:
sub esp, 28
fld TBYTE PTR [esp+32] # load the stack arg
fstp TBYTE PTR [esp] # store a local
mov eax, DWORD PTR [esp]
mov edx, DWORD PTR [esp+4] # return uint64_t in edx:eax
add esp, 28
ret
C99 makes union type-punning well-defined behaviour, and so does GNU C++. I think MSVC defines it too.
But memcpy
is always portable so that might be an even better choice, and it's easier to read in this case where we just want one element.
If you also want the exponent and sign bit, a union between a struct and long double
might be good, except that padding for alignment at the end of the struct will make it bigger. It's unlikely that there'd be padding after a uint64_t
member before a uint16_t
member, though. But I'd worry about :1
and :15
bitfields, because IIRC it's implementation-defined which order the members of a bitfield are stored in.
Upvotes: 3