Stranger6667
Stranger6667

Reputation: 516

How to check whether a u64 number larger than 2 ^ 54 is divisible by f64 with the minimal precision loss?

For a u64 number less than 2 ^ 54 it can be done without much of precision loss by casting to f64:

((6 as f64) % 1.5) < f64::EPSILON

For larger numbers, there will be a significant precision loss:

1u64 << 63           // 9223372036854775808
(1u64 << 63) as f64  // 9223372036854776000

and divisibility will be checked for a different number.

Context: JSONSchema's multipleOf keyword implementation

Question: What is the most efficient way to check divisibility for u64 / i64 numbers that do not fit the f64 mantissa size (f64::MANTISSA_DIGITS which is 53)?

Upvotes: 3

Views: 760

Answers (1)

Eric Postpischil
Eric Postpischil

Reputation: 222942

Here is a solution given that u is some integer and x is a finite non-zero IEEE-754 binary64 number with which we do IEEE-754 arithmetic. x is presumed to represent one specific number, as specified by IEEE-754, and prior rounding errors occurring while obtaining x are not considered. This answer speaks to the mathematics involved, not to the Rust semantics, as I am unfamiliar with Rust.

First, find the representation of x = F • 2E, where F is an odd integer and E is an integer. A simple method for this is:

  • Set F to x and E to 0.
  • While F is not an integer, multiply F by two and subtract one from E.
  • While F is even, divide F by two and add one to E.

All of the above operations can be performed in IEEE-754 arithmetic with no rounding errors. If Rust offers a method to separate the significand and exponent of a floating-point number, akin to C’s frexp function, then incorporating it into the above can improve efficiency.

Now consider whether u is a multiple of x = F • 2E. By definition, it is if and only if there is an integer k such that u = kF • 2E. We will see this is so if and only if u is a multiple of F and is a multiple of 2E, and each of these can be tested.

If 2E is an integer (E is non-negative) and such a k exists, then u is a multiple of F and is a multiple of 2E. Conversely, if u is not a multiple of F or is not a multiple of 2E, then no such k exists (by way of the fundamental theorem of arithmetic).

F is necessarily within bounds of the requested integer format (it is at most 53 bits), and we assume F can be converted to that format. Then divisibility of u by F can be tested. If 2E exceeds the maximum value of the integer format in which u is represented, then u is not a multiple of 2E. Otherwise, 2E can be converted to the format, and the divisibility of u by 2E can be tested.

If 2E is not an integer (E is negative), then, if the required k exists (so u is a multiple of F), it is a multiple of 2E. Conversely, if k is not a multiple of 2E, then kF • 2E is not an integer, so it cannot equal u. Thus u is a multiple of x if and only if u is a multiple of F.

Upvotes: 5

Related Questions