I'm creating a number type that uses arrays to store the numbers. To implement the trait One, I find myself writing code like this:
fn one() -> Self {
    let mut ret_array = [0; N];
    ret_array[0] = 1;
    Self(ret_array)
}
Is there an alternative way to initialize an array with one non-zero element?
I don't think so, no.
But the Rust compiler understands what you are trying to achieve and optimizes it accordingly:
pub fn one<const N: usize>() -> [i32; N] {
    let mut ret_array = [0; N];
    ret_array[0] = 1;
    ret_array
}

pub fn one_with_length_5() -> [i32; 5] {
    one()
}
example::one_with_length_5:
mov rax, rdi
xorps xmm0, xmm0
movups xmmword ptr [rdi + 4], xmm0
mov dword ptr [rdi], 1
ret
- xorps xmm0, xmm0 sets the 16-byte (or 4-int) SSE register xmm0 to [0,0,0,0].
- movups xmmword ptr [rdi + 4], xmm0 copies all 4 ints of the xmm0 register to the location [rdi + 4], which covers elements 1, 2, 3 and 4 of ret_array.
- mov dword ptr [rdi], 1 moves the value 1 to the first element of ret_array.
- ret_array is at the location [rdi], [rdi + 4] is the element at position ret_array[1], [rdi + 8] is the element at position ret_array[2], etc.

As you can see, it only initializes the other four values with 0, and then sets the first value to 1. The first value does not get written twice.
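The address arithmetic above can be checked from Rust itself. This is a minimal sketch, assuming i32 elements as in the answer; element_offset is a hypothetical helper name introduced only for illustration.

```rust
/// Byte offset of element `i` from the start of an `[i32; N]` array.
/// `element_offset` is a hypothetical helper, not part of any library.
pub fn element_offset<const N: usize>(arr: &[i32; N], i: usize) -> usize {
    // Address of the first element, i.e. what `rdi` points at in the listing.
    let base = arr.as_ptr() as usize;
    // Address of element `i`.
    let elem = &arr[i] as *const i32 as usize;
    elem - base
}
```

For a 5-element array, element 1 sits 4 bytes past the start and element 4 sits 16 bytes past it, which is why a single 16-byte store at [rdi + 4] covers elements 1 through 4.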
If you set N to e.g. 8, it does actually write a value twice:
example::one_with_length_8:
mov rax, rdi
xorps xmm0, xmm0
movups xmmword ptr [rdi + 16], xmm0
movups xmmword ptr [rdi + 4], xmm0
mov dword ptr [rdi], 1
ret
Interestingly, it doesn't write element [0] twice, but element [4]. It first writes zeros to [4,5,6,7], then to [1,2,3,4], and finally the 1 to [0].
But that's because this is the fastest way to do it. It stores 4 ints of zeros in an SSE register and then zero-initializes the array 4 ints at a time. Writing one int twice is faster than initializing the remaining values without the help of 4-int SSE instructions.
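The overlapping stores can be mimicked in plain Rust to confirm that writing element 4 twice is harmless. This is only a sketch of the store pattern, not what the compiler emits; store4 and one_with_overlap are illustrative names.

```rust
/// Stand-in for one 16-byte `movups` of zeros: clears four consecutive i32 slots.
/// `store4` is an illustrative helper, not a real API.
fn store4(arr: &mut [i32; 8], start: usize) {
    arr[start..start + 4].fill(0);
}

/// Replays the three stores from the N = 8 listing in the same order.
pub fn one_with_overlap() -> [i32; 8] {
    // Start from non-zero values so every slot visibly needs a write.
    let mut arr = [-1; 8];
    store4(&mut arr, 4); // [rdi + 16]: elements 4, 5, 6, 7
    store4(&mut arr, 1); // [rdi + 4]:  elements 1, 2, 3, 4 (4 is written again)
    arr[0] = 1; // [rdi]: element 0
    arr
}
```

Two 4-wide stores plus one scalar store cover all eight slots. Avoiding the overlap would mean splitting one of the wide stores into narrower ones.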
This would even happen if you initialized it completely manually:
pub fn one_with_length_8() -> [i32; 8] {
    [1, 0, 0, 0, 0, 0, 0, 0]
}
example::one_with_length_8:
mov rax, rdi
mov dword ptr [rdi], 1
xorps xmm0, xmm0
movups xmmword ptr [rdi + 4], xmm0
movups xmmword ptr [rdi + 16], xmm0
ret
You can see the order is different, but the instructions are identical.
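For completeness, here is one more spelling that is not part of the answer above: std::array::from_fn (stable since Rust 1.63) builds the array from a closure over the index, so no mutable temporary is needed. This is a sketch with a hypothetical function name. Whether it compiles to the same instructions has not been checked here.

```rust
/// Builds [1, 0, 0, ...] from a closure over the element index.
/// `one_from_fn` is a hypothetical name; needs Rust 1.63 or later.
pub fn one_from_fn<const N: usize>() -> [i32; N] {
    // The closure is called once per index, in order 0..N.
    std::array::from_fn(|i| if i == 0 { 1 } else { 0 })
}
```

Calling one_from_fn::<5>() yields the same value as the indexing version, and for N = 0 it simply returns an empty array.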