Approximately reversible function from a pair of floats to a single float?

Question

Some time ago, I came across a pair of functions in some CAD code to encode a set of coordinates (a pair of floats) as a single float (to use as a hash key), and then to unpack that single float back into the original pair.

The forward and backward functions only used standard mathematical operations -- no magic fiddling with bit-level representations of floats, no extracting and interleaving individual digits or anything like that. Obviously the reversal is not perfect in practice, because you lose considerable precision going from two floats to one, but according to the Wikipedia page for the function it should have been exactly invertible given infinite precision arithmetic.

Unfortunately, I don't work on that code anymore, and I've forgotten the name of the function so I can't look it up on Wikipedia again. Anybody know of a named mathematical functions that meets that description?

njuffa · Accepted Answer

In a comment, OP clarified that the desired function should map two IEEE-754 binary64 operands into a single such operand. One way to accomplish this is to map each binary64 (double-precision) number into a positive integer in [0, 2²⁶-2], and then use a well-known pairing function to map two such integers into a single positive integer the interval [0,2⁵²), which is exactly representable in a binary64 which has 52 stored significand ("mantissa") bits.

As the binary64 operands are unrestricted in range per a comment by OP, all binades should be representable with equal relative accuracy, and we need to handle zeroes, infinities, and NaNs as well. For this reason I chose log2() to compress the data. Zeros are treated as the smallest binary64 subnormal, 0x1.0p-1074, which has the consequence that both 0x1.0p-1074 and zero will decompress into zero. The result from log2 falls into the range [-1074, 1024). Since we need to store the sign of the operand, we bias the logarithm value by 1074, giving a result in [0, 2098), then scale that to almost [0, 2²⁵), round to the nearest integer, and finally affix the sign of the original binary64 operand. The motivation for almost utilizing the complete range is to leave a little bit of room at the top of the range for special encodings for infinity and NaN (so a single canonical NaN encoding is used).

Since pairing functions known from the literature operate on natural numbers only, we need a mapping from whole numbers to natural numbers. This is easily accomplished by mapping negative whole numbers to odd natural numbers, while positive whole numbers are mapped to even natural numbers. Thus our operands are mapped from (-2²⁵, +2²⁵) to [0, 2²⁶-2]. The pairing function then combines two integers in [0, 2²⁶-2] into a single integer in [0, 2⁵²).

Different pairing functions known from the literature differ in their scrambling behavior, which may impact the hashing functionality mentioned in the question. They may also differ in their performance. Therefore I am offering a selection of four different pairing functions for the pair() / unpair() implementations in the code below. Please see the comments in the code for corresponding literature references.

Unpacking of the packed operand involves applying the inverse of each packing step in reverse order. The unpairing function gives us two natural integers. These are mapped to two whole numbers, which are mapped to two logarithm values, which are then exponentiated with exp2() to recover the original numbers, with a bit of added work to get special values and the sign correct.

While logarithms are represented with a relative accuracy on the order of 10^-8, the expected maximum relative error in final results is on the order of 10^-5 due to the well-known error magnification property of exponentiation. Maximum relative error observed for a pack() / unpack() round-trip in extensive testing was 2.167e-5.

Below is my ISO C99 implementation of the algorithm together with a portion of my test framework. This should be portable to other programming languages with a modicum of effort.

#include 
#include 
#include 
#include 
#include 
#include 

#define SZUDZIK_ELEGANT_PAIRING  (1)
#define ROZSA_PETER_PAIRING      (2)
#define ROSENBERG_STRONG_PAIRING (3)
#define CHRISTOPH_MICHEL_PAIRING (4)

#define PAIRING_FUNCTION  (ROZSA_PETER_PAIRING)

/*
  https://groups.google.com/forum/#!original/comp.lang.c/qFv18ql_WlU/IK8KGZZFJx4J
  From: geo 
  Newsgroups: sci.math,comp.lang.c,comp.lang.fortran
  Subject: 64-bit KISS RNGs
  Date: Sat, 28 Feb 2009 04:30:48 -0800 (PST)

  This 64-bit KISS RNG has three components, each nearly
  good enough to serve alone.    The components are:
  Multiply-With-Carry (MWC), period (2^121+2^63-1)
  Xorshift (XSH), period 2^64-1
  Congruential (CNG), period 2^64
*/
static uint64_t kiss64_x = 1234567890987654321ULL;
static uint64_t kiss64_c = 123456123456123456ULL;
static uint64_t kiss64_y = 362436362436362436ULL;
static uint64_t kiss64_z = 1066149217761810ULL;
static uint64_t kiss64_t;
#define MWC64  (kiss64_t = (kiss64_x << 58) + kiss64_c, \
                kiss64_c = (kiss64_x >> 6), kiss64_x += kiss64_t, \
                kiss64_c += (kiss64_x < kiss64_t), kiss64_x)
#define XSH64  (kiss64_y ^= (kiss64_y << 13), kiss64_y ^= (kiss64_y >> 17), \
                kiss64_y ^= (kiss64_y << 43))
#define CNG64  (kiss64_z = 6906969069ULL * kiss64_z + 1234567ULL)
#define KISS64 (MWC64 + XSH64 + CNG64)

double uint64_as_double (uint64_t a)
{
    double r;
    memcpy (&r, &a, sizeof r);
    return r;
}

#define LOG2_BIAS    (1074.0)
#define CLAMP_LOW    (exp2(-LOG2_BIAS))
#define SCALE        (15993.5193125)
#define NAN_ENCODING (33554430)
#define INF_ENCODING (33554420)

/* check whether argument is an odd integer */
int is_odd_int (double a)
{
    return (-2.0 * floor (0.5 * a) + a) == 1.0;
}

/* compress double-precision number into an integer in (-2**25, +2**25) */
double compress (double x)
{
    double t;

    t = fabs (x);
    t = (t < CLAMP_LOW) ? CLAMP_LOW : t;
    t = rint ((log2 (t) + LOG2_BIAS) * SCALE);
    if (isnan (x)) t = NAN_ENCODING;
    if (isinf (x)) t = INF_ENCODING;
    return copysign (t, x);
}

/* expand integer in (-2**25, +2**25) into double-precision number */
double decompress (double x)
{
    double s, t;

    s = fabs (x);
    t = s / SCALE;
    if (s == (NAN_ENCODING)) t = NAN;
    if (s == (INF_ENCODING)) t = INFINITY;
    return copysign ((t == 0) ? 0 : exp2 (t - LOG2_BIAS), x);
}

/* map whole numbers to natural numbers. Here: (-2^25, +2^25) to [0, 2^26-2] */
double map_Z_to_N (double x)
{
    return (x < 0) ? (-2 * x - 1) : (2 * x);
}

/* Map natural numbers to whole numbers. Here: [0, 2^26-2] to (-2^25, +2^25) */
double map_N_to_Z (double x)
{
    return is_odd_int (x) ? (-0.5 * (x + 1)) : (0.5 * x);
}

#if PAIRING_FUNCTION == SZUDZIK_ELEGANT_PAIRING
/* Matthew Szudzik, "An elegant pairing function." In Wolfram Research (ed.) 
   Special NKS 2006 Wolfram Science Conference, pp. 1-12. 

   Here: map two natural numbers in [0, 2^26-2] to natural number in [0, 2^52),
   and vice versa
*/
double pair (double x, double y)
{
    return (x != fmax (x, y)) ? (y * y + x) : (x * x + x + y);
}

void unpair (double z, double *x, double *y)
{
    double sqrt_z = trunc (sqrt (z));
    double sqrt_z_diff = z - sqrt_z * sqrt_z;
    *x = (sqrt_z_diff < sqrt_z) ? sqrt_z_diff : sqrt_z;
    *y = (sqrt_z_diff < sqrt_z) ? sqrt_z : (sqrt_z_diff - sqrt_z);
}

#elif PAIRING_FUNCTION == ROZSA_PETER_PAIRING
/*
  Rozsa Peter, "Rekursive Funktionen" (1951), p. 44. Via David R. Hagen's blog,
  https://drhagen.com/blog/superior-pairing-function/

  Here: map two natural numbers in [0, 2^26-2] to natural number in [0, 2^52),
  and vice versa
*/
double pair (double x, double y)
{
    double mx = fmax (x, y);
    double mn = fmin (x, y);
    double sel = (mx == x) ? 0 : 1;
    return mx * mx + mn * 2 + sel;
}

void unpair (double z, double *x, double *y)
{
    double sqrt_z = trunc (sqrt (z));
    double sqrt_z_diff = z - sqrt_z * sqrt_z;
    double half_diff = trunc (sqrt_z_diff * 0.5);
    *x = is_odd_int (sqrt_z_diff) ? half_diff : sqrt_z;
    *y = is_odd_int (sqrt_z_diff) ? sqrt_z : half_diff;
}

#elif PAIRING_FUNCTION == ROSENBERG_STRONG_PAIRING
/*
  A. L. Rosenberg and H. R. Strong, "Addressing arrays by shells", 
  IBM Technical Disclosure Bulletin, Vol. 14, No. 10, March 1972,
  pp. 3026-3028.

  Arnold L. Rosenberg, "Allocating storage for extendible arrays," 
  Journal of the ACM, Vol. 21, No. 4, October 1974, pp. 652-670.
  Corrigendum, Journal of the ACM, Vol. 22, No. 2, April 1975, p. 308.

  Matthew P. Szudzik, "The Rosenberg-Strong Pairing Function", 2019
  https://arxiv.org/abs/1706.04129

  Here: map two natural numbers in [0, 2^26-2] to natural number in [0, 2^52),
  and vice versa
 */
double pair (double x, double y)
{
    double mx = fmax (x, y);
    return mx * mx + mx + x - y;
}

void unpair (double z, double *x, double *y)
{
    double sqrt_z = trunc (sqrt (z));
    double sqrt_z_diff = z - sqrt_z * sqrt_z;
    *x = (sqrt_z_diff < sqrt_z) ? sqrt_z_diff : sqrt_z;
    *y = (sqrt_z_diff < sqrt_z) ? sqrt_z : (2 * sqrt_z - sqrt_z_diff);
}
#elif PAIRING_FUNCTION == CHRISTOPH_MICHEL_PAIRING
/*
  Christoph Michel, "Enumerating a Grid in Spiral Order", September 7, 2016,
  https://cmichel.io/enumerating-grid-in-spiral-order. Via German Wikipedia,
  https://de.wikipedia.org/wiki/Cantorsche_Paarungsfunktion

  Here: map two natural numbers in [0, 2^26-2] to natural number in [0, 2^52),
  and vice versa
*/
double pair (double x, double y)
{
    double mx = fmax (x, y);
    return mx * mx + mx + (is_odd_int (mx) ? (x - y) : (y - x));
}

void unpair (double z, double *x, double *y)
{
    double sqrt_z = trunc (sqrt (z));
    double sqrt_z_diff = z - sqrt_z * (sqrt_z + 1);
    double min_clamp = fmin (sqrt_z_diff, 0);
    double max_clamp = fmax (sqrt_z_diff, 0);
    *x = is_odd_int (sqrt_z) ? (sqrt_z + min_clamp) : (sqrt_z - max_clamp);
    *y = is_odd_int (sqrt_z) ? (sqrt_z - max_clamp) : (sqrt_z + min_clamp);
}
#else
#error unknown PAIRING_FUNCTION
#endif

/* Lossy pairing function for double precision numbers. The maximum round-trip
   relative error is about 2.167e-5
*/
double pack (double a, double b)
{
    double c, p, q, s, t;

    p = compress (a);
    q = compress (b);
    s = map_Z_to_N (p);
    t = map_Z_to_N (q);
    c = pair (s, t);
    return c;
}

/* Unpairing function for double precision numbers. The maximum round-trip
   relative error is about 2.167e-5 */
void unpack (double c, double *a, double *b)
{
    double s, t, u, v;

    unpair (c, &s, &t);
    u = map_N_to_Z (s);
    v = map_N_to_Z (t);
    *a = decompress (u);
    *b = decompress (v);
}

int main (void)
{
    double a, b, c, ua, ub, relerr_a, relerr_b;
    double max_relerr_a = 0, max_relerr_b = 0; 

#if PAIRING_FUNCTION == SZUDZIK_ELEGANT_PAIRING
    printf ("Testing with Szudzik's elegant pairing function
");
#elif PAIRING_FUNCTION == ROZSA_PETER_PAIRING
    printf ("Testing with Rozsa Peter's pairing function
");
#elif PAIRING_FUNCTION == ROSENBERG_STRONG_PAIRING
    printf ("Testing with Rosenberg-Strong pairing function
");
#elif PAIRING_FUNCTION == CHRISTOPH_MICHEL_PAIRING
    printf ("Testing with C. Michel's spiral pairing function
");
#else
#error unkown PAIRING_FUNCTION
#endif

    do {
        a = uint64_as_double (KISS64);
        b = uint64_as_double (KISS64);

        c = pack (a, b);
        unpack (c, &ua, &ub);

        if (!isnan(ua) && !isinf(ua) && (ua != 0)) {
            relerr_a = fabs ((ua - a) / a);
            if (relerr_a > max_relerr_a) {
                printf ("relerr_a= %15.8e  a=% 23.16e  ua=% 23.16e
", 
                        relerr_a, a, ua);
                max_relerr_a = relerr_a;
            }
        }
        if (!isnan(ub) && !isinf(ub) && (ub != 0)) {
            relerr_b = fabs ((ub - b) / b);
            if (relerr_b > max_relerr_b) {
                printf ("relerr_b= %15.8e  b=% 23.16e  ub=% 23.16e
", 
                        relerr_b, b, ub);
                max_relerr_b = relerr_b;
            }
        }
    } while (1);
    return EXIT_SUCCESS;
}

Approximately reversible function from a pair of floats to a single float?

Answers (2)

Related Questions