Why is there no arbitrary-sized binary integer type in Rust?

Question

Rust has binary literals, a binary formatter, and a host of integer types, but no explicit binary numeric type.

'Almost-binary' integers

It's true that the expected implementation of an unsigned integer is a big/little-endian binary number in general purpose machines. However, that's far removed from the syntax of a high-level language. For example, if I have an eight-bit binary number 0000 0101 that I want to treat syntactically like a primitive numeric type, I have two problems: (1) the character representation of the number, and (2) the type declaration for the number. If I decide to stick with u8, I have to add a layer of string operations (in Rust) or a layer of vector operations (in MATLAB, for example) where numbers will be displayed or declared literally, and I have to ensure that the binary representation is converted to its equivalent in u8. In this situation, there's no way to make the direct statement 0000 0101 + 0000 0111 without this machinery bubbling up to the syntactic level, and that's just for binary types whose sizes happen to line up with integer types.

A 'true' binary type

An example would be, say, a hypothetical type b3, which is a 3-bit binary number, supporting the appropriate mathematical operations in its field. At a minimum, those operations would be arithmetic, closed over the type b3, of course. (The one defining the type would have to define a convention for how that closure is achieved in practice, e.g., by wrapping or asserting that the result of an operation that can't be expressed in b3 is not defined.)

A binary type like this could be declared as such and then used syntactically the same way as any other numeric type. Thus, 101 + 001 == 110, without the need to deploy bitwise operators, among other added requirements.

Under the hood

If these operations seem prosaic in a programming language that is already expected to have binary representations at its foundation, note that there are subtleties in implementing finite field arithmetic in C-like languages:

/* Multiply two numbers in the GF(2^8) finite field defined 
 * by the polynomial x^8 + x^4 + x^3 + x + 1 = 0
 * using the Russian Peasant Multiplication algorithm
 * (the other way being to do carry-less multiplication followed by a modular reduction)
 */
uint8_t gmul(uint8_t a, uint8_t b) {
    uint8_t p = 0; /* the product of the multiplication */
    while (b) {
        if (b & 1) /* if b is odd, then add the corresponding a to p (final product = sum of all a's corresponding to odd b's) */
            p ^= a; /* since we're in GF(2^m), addition is an XOR */

        if (a & 0x80) /* GF modulo: if a >= 128, then it will overflow when shifted left, so reduce */
            a = (a << 1) ^ 0x11b; /* XOR with the primitive polynomial x^8 + x^4 + x^3 + x + 1 (0b1_0001_1011) – you can change it but it must be irreducible */
        else
            a <<= 1; /* equivalent to a*2 */
        b >>= 1; /* equivalent to b // 2 */
    }
    return p;
}

Why bother?

A Rust type with trait implementations accomplishing the above would collapse all of that down to Mul for b8, which seems to me to be a great feature about Rust. Being able to refer to characteristics of a b8 number using a more formal and standard interface than bitmasks and shifts would also seem to be a useful thing Rust could offer here.

What are the reasons why no such types are present anywhere in the core or crates?

bright-star · Accepted Answer

In good faith, and perhaps so we can all agree nobody here is crazy (?), I implemented a crate that is an attempt at capturing the semantics of finite fields in Rust independent of the underlying expectations of the language or hardware. I have to warn you that it is neither rigorously tested nor efficiently implemented, but it compiles and so does its examples.

It offers the following semantics:

If you can think of a finite field as either a set of coefficients of a restricted polynomial, or as a vector of p-adic numbers, you can define a type that will store the coefficients as a vector that quacks like a number. For example, a field of two-digit binary numbers can be generated with the following macro:
```
#![allow(non_camel_case_types)]
#[macro_use] extern crate finite_fields;

binary_type! { b2, 2 }
```
That macro expands into the implementation of a newtype struct carrying an array:
```
/// A binary number ($fieldwidth digits).
#[derive(Clone, Copy, PartialEq)]
pub struct $tyname([b1; $fieldwidth]);

impl $tyname {
  pub fn new(vals: [b1; $fieldwidth]) -> $tyname {
    $tyname(vals)
  }
} // ...
```

The types defined admit the usual arithmetic operations with overflow errors on saturation and divide by zero errors. Specifically, I implemented Ordering, Add, Sub, Mul, Div, BitXor, Index, and IndexMut on "unit" n-ary types in macros, and then used those as the digits of larger macro-generated n-ary numbers.

/// Arithmetic addition with overflow error.
impl Add for $tyname {
  type Output = Result<$tyname, OverflowError>;
  fn add(self, other: $tyname) -> Result<$tyname, OverflowError> {
    let sum = self.0 + other.0;
    if sum > $arity - 1 {
      Err(OverflowError::Default { arg1: self.to_string(),
                                   arg2: other.to_string() })
    } else {
      Ok($tyname(sum as $storsize))
    }
  }
}

Finite fields of any arity can be defined, but the storage type has to be specified by the user to meet with the standard types used by Rust:
```
/// Creates a ternary type named `t2`, with a unit type named `t1`, storing each
/// digit in a `u8`, with two digits.
nary_type! { t2, t1, 3, u8, 2 }
```
This was the locus of my confusion. The answer as shown by this crate's implementation is that yes, you can raise arbitrary finite field semantics to the level of "natural" (i.e., base-10, base-2, base-8, and base-16) numeric fields embedded into the language and hardware (i.e., you can pretend they're regular numeric types and get the Rustic checks you expect, if you believe newtypes are types), but you still need to pay the piper in the form of storage overhead (and probably incurable calculation inefficiency). I don't think I'm alone in being caught up short by that ontological fault line between discrete math and applied CS, but I'm not sure it matters any more.

At any rate, you can do totally goofy things with the same basic macros, like work in base-7:

/// Creates a septary type named `s3`, with a unit type named `s1`, each digit
/// stored in a `u8`, with three digits.
nary_type! { s3, s1, 7, u8, 3 }

Hooray. Let's all get drunk and forget the whole thing happened.

Why is there no arbitrary-sized binary integer type in Rust?

'Almost-binary' integers

A 'true' binary type

Under the hood

Why bother?

Answers (1)

Related Questions