How to get the mantissa of an 80-bit long double as an int on x86-64

Question

frexpl won't work because it keeps the mantissa as part of a long double. Can I use type punning, or would that be dangerous? Is there another way?

Peter Cordes · Accepted Answer

x86's float and integer endianness is little-endian, so the significand (aka mantissa) is the low 64 bits of an 80-bit x87 long double.

In assembly, you just load the normal way, like mov rax, [rdi].

Unlike IEEE binary32 (float) or binary64 (double), the 80-bit long double stores the leading bit of the significand explicitly: 1 for normal numbers, 0 for subnormals. See https://en.wikipedia.org/wiki/Extended_precision#x86_extended_precision_format

So the unsigned integer value (magnitude) of the true significand is the same as what's actually stored in the object-representation.
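To make the relationship between the stored integer and the represented value concrete, here is a worked formula. The symbols s (sign bit), e (15-bit biased exponent field) and m (the 64-bit significand read as an unsigned integer) are names chosen for this sketch, not from the answer. It covers finite values only; e = 32767 encodes infinities and NaNs.

```latex
v = (-1)^{s} \cdot m \cdot 2^{\max(e,\,1) - 16383 - 63},
\qquad 0 \le e \le 32766,\quad 0 \le m < 2^{64}

% Example: 1.0 has s = 0, e = 16383, m = 2^{63}
v = 2^{63} \cdot 2^{16383 - 16383 - 63} = 2^{63} \cdot 2^{-63} = 1
```

The max(e, 1) term reflects that subnormals (e = 0) use the same scale as the smallest normal exponent.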

If you want signed int, too bad; including the sign bit it would be 65 bits but int is only 32-bit on any x86 C implementation.

If you want int64_t, you could maybe right shift by 1 to discard the low bit, making room for a sign bit. Then do 2's complement negation if the sign bit was set, leaving you with a signed 2's complement representation of the significand value divided by 2. (IEEE FP uses sign/magnitude with a sign bit at the top of the bit-pattern)
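A minimal sketch of that shift-and-negate idea, working on the raw fields rather than on a long double so it does not depend on the compiler's long double type. The function name and parameters are hypothetical: mant is the 64-bit significand field and sign_exp is the 16-bit word above it, with the sign in its top bit.

```c
#include <stdint.h>

// Hypothetical helper: signed value of (significand / 2).
// mant     = low 64 bits of the 80-bit value (the significand)
// sign_exp = next 16 bits (bit 15 = sign, bits 14..0 = biased exponent)
int64_t signed_half_significand(uint64_t mant, uint16_t sign_exp)
{
    // Dropping the low bit leaves at most 63 significant bits,
    // so the magnitude always fits in a non-negative int64_t.
    int64_t half = (int64_t)(mant >> 1);

    // Sign/magnitude to 2's complement: negate if the sign bit is set.
    // -half cannot overflow because half <= INT64_MAX.
    return (sign_exp & 0x8000u) ? -half : half;
}
```

Note that this loses the lowest significand bit, so it is only suitable when one bit of precision can be sacrificed.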

In C/C++, yes you need to type-pun, e.g. with a union or memcpy. All C implementations on x86 / x86-64 that expose 80-bit floating point at all use a 12 or 16-byte type with the 10-byte value at the bottom.

Beware that MSVC uses long double = double, a 64-bit float, so check LDBL_MANT_DIG from float.h, or sizeof(long double). All 3 static_assert() statements in the function below fail on MSVC, so they each did their job and saved us from copying a whole binary64 double (sign/exp/mantissa) into our uint64_t.

// valid C11 and C++11
#include <float.h>   // float numeric-limit macros
#include <stdint.h>
#include <assert.h>  // C11 static assert
#include <string.h>  // memcpy

// inline
uint64_t ldbl_mant(long double x)
{
    // we can assume CHAR_BIT = 8 when targeting x86, unless you care about DeathStation 9000 implementations.
    static_assert( sizeof(long double) >= 10, "x87 long double must be >= 10 bytes" );
    static_assert( LDBL_MANT_DIG == 64, "x87 long double significand must be 64 bits" );

    uint64_t retval;
    memcpy(&retval, &x, sizeof(retval));
    static_assert( sizeof(retval) < sizeof(x), "uint64_t should be strictly smaller than long double" ); // sanity check for wrong types
    return retval;
}

This compiles efficiently on gcc/clang/ICC (on Godbolt) to just one instruction as a stand-alone function (because the calling convention passes long double in memory). After inlining into code with a long double in an x87 register, it will presumably compile to a TBYTE x87 store and an integer reload.

## gcc/clang/ICC -O3 for x86-64
ldbl_mant:
  mov rax, QWORD PTR [rsp+8]
  ret

For 32-bit, gcc has a weird redundant-copy missed-optimization bug which ICC and clang don't have; they just do the 2 loads from the function arg without copying first.

# GCC -m32 -O3  copies for no reason
ldbl_mant:
  sub esp, 28
  fld TBYTE PTR [esp+32]            # load the stack arg
  fstp TBYTE PTR [esp]              # store a local
  mov eax, DWORD PTR [esp]
  mov edx, DWORD PTR [esp+4]        # return uint64_t in edx:eax
  add esp, 28
  ret

C99 makes union type-punning well-defined behaviour, and so does GNU C++. I think MSVC defines it too.

But memcpy is always portable so that might be an even better choice, and it's easier to read in this case where we just want one element.

If you also want the exponent and sign bit, a union between a struct and long double might be good, except that padding for alignment at the end of the struct will make it bigger. It's unlikely that there'd be padding after a uint64_t member before a uint16_t member, though. But I'd worry about :1 and :15 bitfields, because IIRC it's implementation-defined which order the members of a bitfield are stored in.
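One way to sidestep both the struct-padding and the bitfield-order worries is to memcpy each field separately and split the sign from the exponent with a mask and shift. This is only a sketch: the struct and function names are hypothetical, and it assumes a little-endian host (true on x86) and that the pointer refers to at least the 10 value bytes of an x87 long double.

```c
#include <stdint.h>
#include <string.h>   // memcpy

// Hypothetical result type, not part of the answer above.
struct x87_fields {
    uint64_t significand;  // 64 bits, including the explicit leading bit
    uint16_t exponent;     // 15-bit biased exponent (bias 16383)
    unsigned sign;         // 1 if negative, else 0
};

// p must point at (at least) the 10 value bytes of an x87 long double.
// Assumes a little-endian host, as on x86 / x86-64.
struct x87_fields x87_decode(const void *p)
{
    uint64_t mant;
    uint16_t sign_exp;
    memcpy(&mant, p, sizeof mant);                                     // bytes 0..7
    memcpy(&sign_exp, (const unsigned char *)p + 8, sizeof sign_exp);  // bytes 8..9

    struct x87_fields f;
    f.significand = mant;
    f.exponent = (uint16_t)(sign_exp & 0x7FFF);  // low 15 bits
    f.sign = sign_exp >> 15;                     // top bit
    return f;
}
```

For a long double x that has passed the same static_assert checks as ldbl_mant, calling x87_decode(&x) would read its fields without relying on any particular bitfield layout.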
