CryptoKitty

Reputation: 735

Why is the performance of 64-bit multiplication comparable to 32-bit multiplication?

I am measuring the performance of Rust when two 64-bit numbers are multiplied versus when two 32-bit numbers are multiplied. Recall that the result of a 64-bit multiplication is a 128-bit number, and the result of a 32-bit multiplication is a 64-bit number. I expected the 64-bit multiplication to be at least 2x slower than the 32-bit one, mainly because there is no native 128-bit support, so to multiply two 64-bit numbers you split them into 32-bit high and low halves. However, when I ran the test, both turned out to perform similarly.

Here is the code I used:

fn main() {
    test_64_mul();
    test_32_mul();
}

fn test_64_mul() {
    let test_num: u64 = 12345678653435363454;
    use std::time::Instant;
    let mut now = Instant::now();
    let mut elapsed = now.elapsed();
    for _ in 1..2000 {
        now = Instant::now();
        let _prod = test_num as u128 * test_num as u128;
        elapsed = elapsed + now.elapsed();
    }
    println!("Elapsed For 64: {:.2?}", elapsed);
}

fn test_32_mul() {
    let test_num: u32 = 1234565755;
    use std::time::Instant;
    let mut now = Instant::now();
    let mut elapsed = now.elapsed();
    for _ in 1..2000 {
        now = Instant::now();
        let _prod = test_num as u64 * test_num as u64;
        elapsed = elapsed + now.elapsed();
    }
    println!("Elapsed For 32: {:.2?}", elapsed);
}

The output after running this code is:

Elapsed For 64: 25.58µs

Elapsed For 32: 26.08µs

I am using a MacBook Pro with an M1 chip and Rust version 1.60.0.

Upvotes: 3

Views: 267

Answers (1)

Chayim Friedman

Reputation: 71605

Because the compiler has noticed you don't use the result, and eliminated the multiplication completely.

See the diff at https://rust.godbolt.org/z/5sjze7Mbv.

You should use something like std::hint::black_box() or, much better, a benchmarking framework like criterion.

Also, the overhead of creating a new Instant on every iteration is likely much higher than that of the multiplication itself. Like I said, use a benchmarking framework.

As noted by @StephenC, it is also unlikely that your clock resolution is fine enough to measure a single multiplication.
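To make the points above concrete, here is a rough sketch of how the test could be restructured without a framework. It wraps the operands and the products in std::hint::black_box() so the optimizer has a harder time removing the multiplication, and it starts the timer once around the whole loop instead of once per iteration. The function names, the iterations parameter, and the iteration count in main are my own choices for illustration. Note that black_box is only a best-effort hint and, as far as I know, was stabilized in Rust 1.66, so the 1.60.0 toolchain from the question would need an upgrade. Treat the numbers it prints as indicative at best; criterion remains the better option.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Times `iterations` 64x64 -> 128-bit multiplications with a single timer
/// around the whole loop. Returns the elapsed time and the last product.
fn time_64_mul(x: u64, iterations: u32) -> (Duration, u128) {
    let mut last: u128 = 0;
    let start = Instant::now();
    for _ in 0..iterations {
        // black_box on the input hides its value from the optimizer,
        // so the product is harder to constant-fold or hoist out of the loop.
        let a = black_box(x) as u128;
        // black_box on the result marks it as used, so it is not dead code.
        last = black_box(a * a);
    }
    (start.elapsed(), last)
}

/// Same as above for 32x32 -> 64-bit multiplications.
fn time_32_mul(x: u32, iterations: u32) -> (Duration, u64) {
    let mut last: u64 = 0;
    let start = Instant::now();
    for _ in 0..iterations {
        let a = black_box(x) as u64;
        last = black_box(a * a);
    }
    (start.elapsed(), last)
}

fn main() {
    // Iteration count is an arbitrary choice; larger counts make the
    // timer's own overhead and resolution matter less.
    let iterations = 10_000_000;
    let (t64, p64) = time_64_mul(12345678653435363454, iterations);
    let (t32, p32) = time_32_mul(1234565755, iterations);
    println!("Elapsed for 64: {:.2?} (last product {})", t64, p64);
    println!("Elapsed for 32: {:.2?} (last product {})", t32, p32);
}
```

Build with cargo run --release; a debug build measures mostly unoptimized loop overhead rather than the multiplication.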

Upvotes: 7
