Axel Jacobsen

Reputation: 148

How to convert [u64;N] to [u8;8*N]?

I am writing a compression algorithm in Rust.

I have a "conversion table" which maps input bytes to outputs (each output is a u64 which is the code, and a u8 which is the length of the code, which need not be a multiple of 8). For each input byte, I find the corresponding output and bit-shift the code into a large contiguous array of u64s. I then want to write this array of u64s to a file.

What would be the best way to convert the [u64; WRITE_CHUNK_SIZE] to a [u8; 8 * WRITE_CHUNK_SIZE]?

Is looping through the [u64; WRITE_CHUNK_SIZE], calling to_be_bytes() on each u64, and writing the resulting bytes into the [u8; 8*WRITE_CHUNK_SIZE] the best I can do?

I saw examples of align_to which required unsafe, which seems excessive just to convert data types. I also saw this which outputs a Vec. I think this just pushes the problem down the road, though, because I still have to convert that Vec<u8> into an [u8] to write to the file. Am I wrong?

Upvotes: 3

Views: 3360

Answers (1)

Kevin Reid

Reputation: 43743

The bytemuck crate provides functions for reinterpreting one type as another without use of unsafe, whenever this is safe because both types are “just bytes” (have no invalid values and no padding). Just call bytemuck::cast to convert between integer array types, as long as both types have the same total size. (Or use bytemuck::cast_slice if you have a slice of integers or integer arrays.)

Such conversions only change the type, so they have no run-time cost. However, they do not reorder bytes to a specific endianness, so the result exposes the native endianness of the machine you're running on. In your case, perhaps you can rearrange your table to solve this more efficiently than by explicitly swapping the bytes. If you can't, then to_be_bytes in a loop is probably as good as it gets.

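// Requires the bytemuck crate as a dependency.
const WRITE_CHUNK_SIZE: usize = 4;

fn main() {
    let sixtyfours: [u64; WRITE_CHUNK_SIZE] = [1, 2, 3, 0x0102030405060708];
    // Reinterpret the array's bytes in place; no byte reordering happens.
    let eights: [u8; 8 * WRITE_CHUNK_SIZE] = bytemuck::cast(sixtyfours);
    println!("{:?}", eights);
}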

This prints (on a little-endian machine):

[1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 8, 7, 6, 5, 4, 3, 2, 1]
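
For comparison, here is a minimal sketch of the to_be_bytes loop mentioned above, using only the standard library. The helper names to_be_byte_array and write_words are made up for illustration, and a Vec<u8> stands in for the file (a std::fs::File should work the same way, since both implement Write).

```rust
use std::io::{self, Write};

const WRITE_CHUNK_SIZE: usize = 4;

/// Hypothetical helper: serializes each u64 in big-endian order,
/// so the result does not depend on the host's endianness.
fn to_be_byte_array(words: &[u64; WRITE_CHUNK_SIZE]) -> [u8; 8 * WRITE_CHUNK_SIZE] {
    let mut bytes = [0u8; 8 * WRITE_CHUNK_SIZE];
    // Pair each 8-byte window of the output with one input word.
    for (chunk, word) in bytes.chunks_exact_mut(8).zip(words.iter()) {
        chunk.copy_from_slice(&word.to_be_bytes());
    }
    bytes
}

/// Hypothetical helper: writes the converted words to any `Write` sink.
fn write_words<W: Write>(mut out: W, words: &[u64; WRITE_CHUNK_SIZE]) -> io::Result<()> {
    out.write_all(&to_be_byte_array(words))
}

fn main() -> io::Result<()> {
    let sixtyfours: [u64; WRITE_CHUNK_SIZE] = [1, 2, 3, 0x0102030405060708];
    // A Vec<u8> is used here instead of a file to keep the sketch self-contained.
    let mut sink: Vec<u8> = Vec::new();
    write_words(&mut sink, &sixtyfours)?;
    println!("{:?}", sink);
    Ok(())
}
```

Unlike the cast, this puts the most significant byte of each u64 first on every machine, so the last eight bytes come out as 1, 2, 3, 4, 5, 6, 7, 8. It also touches on the Vec<u8> concern from the question: write_all takes a &[u8], and a Vec<u8> or a byte array can be borrowed as one directly, so no extra conversion step should be needed.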

Upvotes: 6
