Matt Thomas

Reputation: 5734

Splitting owned array into owned halves

I would like to divide a single owned array into two owned halves—two separate arrays, not slices of the original array. The respective sizes are compile time constants. Is there a way to do that without copying/cloning the elements?

let array: [u8; 4] = [0, 1, 2, 3];

let chunk_0: [u8; 2] = ???;
let chunk_1: [u8; 2] = ???;

assert_eq!(
  [0, 1],
  chunk_0
);
assert_eq!(
  [2, 3],
  chunk_1
);

Since it would amount to merely moving ownership of the elements, I have a hunch there should be a zero-cost abstraction for this. I wonder if I could do something like this with some clever use of transmute and forget. But there are a lot of scary warnings in the docs for those functions.

My main motivation is to operate on large in-memory arrays with fewer memory copies. For example:

let raw = [0u8; 1024 * 1024];

let a = u128::from_be_bytes(???); // Take the first 16 bytes
let b = u64::from_le_bytes(???); // Take the next 8 bytes
let c = ...

The only way I know to accomplish patterns like the above involves a lot of redundant memory copying.

Upvotes: 8

Views: 1884

Answers (3)

Kevin Reid

Reputation: 43743

The bytemuck library provides a safe wrapper for reinterpreting any data type that is “plain old data” (more precisely: all possible byte sequences of the right size are valid values), as long as the input and output are the same size (or the input is a slice whose byte length is divisible by the output type's size). This is equivalent to a transmute solution, but without needing to write any new unsafe code.

let array: [u8; 4] = [0, 1, 2, 3];

let [chunk_0, chunk_1]: [[u8; 2]; 2] = bytemuck::cast(array);

If you'd like to avoid using additional libraries, I recommend the try_into() approach that's already been posted.
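For the two-halves case from the question, a std-only version of the try_into() approach might look like the sketch below. The split_array helper and its const parameters are hypothetical names, not part of std, and it copies the bytes out of slices rather than reinterpreting them in place; it also assumes u8 elements.

```rust
use std::convert::TryInto;

/// Splits `arr` into a head of length `A` and a tail of length `B`.
/// Hypothetical helper, not part of std. Panics if `A + B != N`.
fn split_array<const N: usize, const A: usize, const B: usize>(
    arr: [u8; N],
) -> ([u8; A], [u8; B]) {
    assert_eq!(A + B, N, "output lengths must add up to the input length");
    // Converting a slice into an array checks the length and copies the bytes.
    let head: [u8; A] = arr[..A].try_into().unwrap();
    let tail: [u8; B] = arr[A..].try_into().unwrap();
    (head, tail)
}
```

The output lengths are picked up from the type annotation at the call site, for example by binding the result to a pattern of type ([u8; 2], [u8; 2]).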

Upvotes: 2

lkolbly

Reputation: 1240

use std::convert::TryInto;

let raw = [0u8; 1024 * 1024];
    
let a = u128::from_be_bytes(raw[..16].try_into().unwrap()); // Take the first 16 bytes
let b = u64::from_le_bytes(raw[16..24].try_into().unwrap()); // Take the next 8 bytes

In practice, I've found the compiler is pretty smart about optimizing this. With optimizations, it will do the above in a single copy (directly into the register that holds a or b, respectively). As an example, according to godbolt, this:

use std::convert::TryInto;

pub fn cvt(bytes: [u8; 24]) -> (u128, u64) {
    let a = u128::from_be_bytes(bytes[..16].try_into().unwrap()); // Take the first 16 bytes
    let b = u64::from_le_bytes(bytes[16..24].try_into().unwrap()); // Take the next 8 bytes
    (a, b)
}

with -C opt-level=3 compiles into:

example::cvt:
        mov     rax, qword ptr [rdi + 8]
        bswap   rax
        mov     rdx, qword ptr [rdi]
        bswap   rdx
        mov     rcx, qword ptr [rdi + 16]
        ret

The compiler has optimized out the extra copies, the call to try_into, and the possible panic.
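If you are pulling many fields out of one buffer in sequence, the same idea can be wrapped in a small cursor-style helper so the offsets don't have to be written by hand. This is only a sketch: take is a hypothetical name, and it still copies N bytes per field (which, as above, the optimizer tends to reduce to plain loads).

```rust
use std::convert::TryInto;

/// Copies the first `N` bytes of `input` into an array and advances `input`
/// past them. Hypothetical helper; returns `None` if fewer than `N` bytes remain.
fn take<const N: usize>(input: &mut &[u8]) -> Option<[u8; N]> {
    if input.len() < N {
        return None;
    }
    let (head, rest) = input.split_at(N);
    // Move the cursor forward so the next call starts after this field.
    *input = rest;
    head.try_into().ok()
}
```

The array length is inferred from the consumer, so passing the result to u128::from_be_bytes reads 16 bytes and passing it to u64::from_le_bytes reads 8.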

Upvotes: 6

Netwave

Reputation: 42678

You can use std::mem::transmute (warning: unsafe!):

fn main() {
    let array: [u8; 4] = [0, 1, 2, 3];

    let [chunk_0, chunk_1]: [[u8; 2]; 2] =
        unsafe { std::mem::transmute::<[u8; 4], [[u8; 2]; 2]>(array) };

    assert_eq!([0, 1], chunk_0);
    assert_eq!([2, 3], chunk_1);
}
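For small arrays, a safe alternative that avoids transmute is to destructure the array with a pattern, which moves each element and so also works for element types that are not Copy. A minimal sketch, with split_four as a hypothetical helper whose length is fixed at 4 because the pattern has to name every element:

```rust
/// Splits an owned 4-element array into two owned halves by moving each element.
/// Hypothetical helper, not part of std.
fn split_four<T>(array: [T; 4]) -> ([T; 2], [T; 2]) {
    // Destructuring moves the elements out of the array; nothing is cloned.
    let [a, b, c, d] = array;
    ([a, b], [c, d])
}
```

This doesn't scale to large arrays, since every element needs its own binding, but it needs no unsafe code.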


Upvotes: 8
