Thorkil Værge

Reputation: 3827

How to initialize array with one non-zero value

I'm creating a number type that uses arrays to store the numbers. To implement the trait One, I find myself writing code like this:

    fn one() -> Self {
        let mut ret_array = [0; N];
        ret_array[0] = 1;
        Self(ret_array)
    }

Is there an alternative way to initialize an array with one non-zero element?

Upvotes: 2

Views: 774

Answers (1)

Finomnis

Reputation: 22838

I don't think so, no.

But the Rust compiler understands what you are trying to achieve and optimizes it accordingly:

pub fn one<const N: usize>() -> [i32; N] {
    let mut ret_array = [0; N];
    ret_array[0] = 1;
    ret_array
}


pub fn one_with_length_5() -> [i32; 5] {
    one()
}

example::one_with_length_5:
        mov     rax, rdi
        xorps   xmm0, xmm0
        movups  xmmword ptr [rdi + 4], xmm0
        mov     dword ptr [rdi], 1
        ret
  • xorps xmm0, xmm0 zeroes the 16-byte SSE register xmm0, which holds 4 ints: [0,0,0,0].
  • movups xmmword ptr [rdi + 4], xmm0 copies all 4 ints of the xmm0 register to the address rdi + 4, which covers elements 1, 2, 3 and 4 of ret_array.
  • mov dword ptr [rdi], 1 writes the value 1 to the first element of ret_array.
  • ret_array starts at [rdi]; [rdi + 4] is ret_array[1], [rdi + 8] is ret_array[2], etc.

As you can see, it only initializes the other four values with 0, and then sets the first value to 1. The first value does not get written twice.
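To tie this back to the question, here is a minimal sketch of the same pattern inside a const-generic newtype. The name `Number` and the `u32` element type are assumptions for illustration, since the question does not show its actual type, and `one` is written as an inherent method rather than an implementation of the `One` trait.

```rust
// Hypothetical newtype standing in for the question's number type.
#[derive(Debug, PartialEq)]
pub struct Number<const N: usize>(pub [u32; N]);

impl<const N: usize> Number<N> {
    /// Zero-fills the array, then overwrites the first element with 1.
    /// Index 0 is out of bounds when N == 0, so this sketch assumes N >= 1.
    pub fn one() -> Self {
        let mut ret_array = [0; N];
        ret_array[0] = 1;
        Self(ret_array)
    }
}
```

Calling `Number::<5>::one()` yields `Number([1, 0, 0, 0, 0])`.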


Small remark

If you set N to e.g. 8, it does actually write one of the values twice:

example::one_with_length_8:
        mov     rax, rdi
        xorps   xmm0, xmm0
        movups  xmmword ptr [rdi + 16], xmm0
        movups  xmmword ptr [rdi + 4], xmm0
        mov     dword ptr [rdi], 1
        ret

Interestingly, the value that gets written twice is not [0] but [4]. The two 16-byte stores cover elements [4,5,6,7] and then [1,2,3,4], and only after that is the 1 written to [0]. But that's because this is the fastest way to do it. It keeps 4 ints of zeros in an SSE register and then zero-initializes the array 4 ints at a time. Writing one int twice is faster than initializing the remaining values without the help of 4-int SSE instructions.
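The overlap can be checked with a little offset arithmetic. The sketch below does not run the assembly; it only models the three stores from the N = 8 listing as index ranges (byte offset divided by 4, the size of an i32) and counts how often each element is written. The function name `store_counts` is made up for this illustration.

```rust
/// Counts how many times each element of an 8-int array is stored to
/// by the three store instructions in the N = 8 listing.
pub fn store_counts() -> [u32; 8] {
    let mut counts = [0u32; 8];

    // movups xmmword ptr [rdi + 16], xmm0
    // Byte offset 16 is element 4; a 16-byte store covers elements 4..=7.
    for c in &mut counts[4..8] {
        *c += 1;
    }

    // movups xmmword ptr [rdi + 4], xmm0
    // Byte offset 4 is element 1; a 16-byte store covers elements 1..=4.
    for c in &mut counts[1..5] {
        *c += 1;
    }

    // mov dword ptr [rdi], 1
    // A 4-byte store at offset 0 covers element 0 only.
    counts[0] += 1;

    counts
}
```

The result is `[1, 1, 1, 1, 2, 1, 1, 1]`: only element 4 is stored to twice, and element 0 is written exactly once.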

This would even happen if you initialized it completely manually:

pub fn one_with_length_8() -> [i32; 8] {
    [1,0,0,0,0,0,0,0]
}

example::one_with_length_8:
        mov     rax, rdi
        mov     dword ptr [rdi], 1
        xorps   xmm0, xmm0
        movups  xmmword ptr [rdi + 4], xmm0
        movups  xmmword ptr [rdi + 16], xmm0
        ret

You can see the order is different, but the instructions are identical.
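As a small sanity check on the source side, the two spellings can be compared directly. This sketch only confirms that they produce the same array; it says nothing about the generated instructions. The name `one_literal_8` is hypothetical and is used here only to avoid clashing with the generic `one`.

```rust
/// Generic version from the answer: zero-fill, then set element 0.
pub fn one<const N: usize>() -> [i32; N] {
    let mut ret_array = [0; N];
    ret_array[0] = 1;
    ret_array
}

/// Fully manual initialization for N = 8 (hypothetical name).
pub fn one_literal_8() -> [i32; 8] {
    [1, 0, 0, 0, 0, 0, 0, 0]
}
```

`one::<8>()` and `one_literal_8()` compare equal, so choosing between them is a matter of readability and of whether N needs to stay generic.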

Upvotes: 2
