LittleBigTech

Reputation: 157

Reading from file at different offsets using Rust

I am working on a project that involves reading different information from a file at different offsets.

Currently, I am using the following code:

// ------------------------ SECTORS PER CLUSTER ------------------------

// starts at 13
opened_file.seek(SeekFrom::Start(13)).unwrap();
let aux: &mut [u8] = &mut [0; 1];
let _buf = opened_file.read_exact(aux);

// ------------------------ RESERVED SECTORS ------------------------

// starts at 14
opened_file.seek(SeekFrom::Start(14)).unwrap();
let aux: &mut [u8] = &mut [0; 2];
let _buf = opened_file.read_exact(aux);

But as you can see, I need to create a new buffer of the size I want to read every time. I can't specify it directly as a parameter of the function.

I tried to define a struct holding all the different pieces of data I want to read, but I could not get it to compile. For example:

struct FileStruct {
    a1: &mut [u8] &mut [0; 1],
    a2: &mut [u8] &mut [0; 2],
}

Which are the types that are required for the read_exact method to work?

Is there a more effective way to read information from different offsets of a file without having to repeatedly copy-paste these lines of code for every piece of information I want to read from the file? Some sort of function, Cursor, or Vector to easily move around the offset? And a way to write this info into struct fields?

Upvotes: 4

Views: 5609

Answers (3)

Matthias Braun

Reputation: 34303

Also building on Aplet123's answer, the following seek_read function doesn't need to know at compile time how many bytes to read, since it returns a Vec<u8> allocated at runtime instead of filling a fixed-size array:

use std::io::{Read, Result, Seek, SeekFrom};

// Starting at `offset`, reads `amount_to_read` bytes from `reader`.
// Returns the bytes as a vector.
fn seek_read(
    reader: &mut (impl Read + Seek),
    offset: u64,
    amount_to_read: usize,
) -> Result<Vec<u8>> {
    // A buffer filled with as many zeros as we'll read with read_exact
    let mut buf = vec![0; amount_to_read];
    reader.seek(SeekFrom::Start(offset))?;
    reader.read_exact(&mut buf)?;
    Ok(buf)
}

Here are some tests to demonstrate how seek_read behaves:

use std::io::Cursor;
#[test]
fn seek_read_works() {
    let bytes = b"Hello world!";
    let mut reader = Cursor::new(bytes);

    assert_eq!(seek_read(&mut reader, 0, 2).unwrap(), b"He");
    assert_eq!(seek_read(&mut reader, 1, 4).unwrap(), b"ello");
    assert_eq!(seek_read(&mut reader, 6, 5).unwrap(), b"world");
    assert_eq!(seek_read(&mut reader, 2, 0).unwrap(), b"");
}

#[test]
#[should_panic(expected = "failed to fill whole buffer")]
fn seek_read_beyond_buffer_fails() {
    let mut reader = Cursor::new(b"Hello world!");
    seek_read(&mut reader, 6, 99).unwrap();
}

#[test]
#[should_panic(expected = "failed to fill whole buffer")]
fn start_seek_reading_beyond_buffer_fails() {
    let mut reader = Cursor::new(b"Hello world!");
    seek_read(&mut reader, 99, 1).unwrap();
}

Upvotes: 1

Masklinn

Reputation: 42227

This is a complementary answer to Aplet123's: it's not clear from the question that you must store the raw bytes as-is in a structure. If you don't, you can allocate one buffer (as a fixed-size array) and reuse it through a slice of the right length, e.g.

let mut buf = [0u8;16];
opened_file.read_exact(&mut buf[..4])?; // will read 4 bytes
// do thing with the first 4 bytes
opened_file.read_exact(&mut buf[..8])?; // will read 8 bytes this time
// etc...
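
A minimal sketch of that reuse pattern, written against any `Read` implementor rather than a real file (the function name `read_two_chunks` is hypothetical). Note that without a `seek` in between, each `read_exact` continues where the previous one stopped, and each read overwrites the start of the shared buffer, so a chunk has to be used or copied out before the next read.

```rust
use std::io::{self, Read};

// Hypothetical helper: reads a 4-byte chunk and then an 8-byte chunk
// through the same 16-byte stack buffer, copying each chunk out before
// the next read overwrites the buffer.
fn read_two_chunks(reader: &mut impl Read) -> io::Result<(Vec<u8>, Vec<u8>)> {
    let mut buf = [0u8; 16];

    // Fills buf[0..4] with the next 4 bytes of the stream.
    reader.read_exact(&mut buf[..4])?;
    let first = buf[..4].to_vec();

    // Overwrites buf[0..8] with the 8 bytes that follow.
    reader.read_exact(&mut buf[..8])?;
    let second = buf[..8].to_vec();

    Ok((first, second))
}
```

Copying into a `Vec` is only there to make the result easy to inspect; in real code you would typically decode each chunk right after reading it.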

You could also use the byteorder crate, which lets you directly read numbers or sequences of numbers. It basically just does the underlying "create stack buffer of the right size; read; decode" for you.

That's especially useful because it looks a lot like "SECTORS PER CLUSTER" should be a u8 and "RESERVED SECTORS" should be a u16. With byteorder you can call read_u8() or read_u16::<LittleEndian>() directly (pick the byte order your file format actually uses).
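
If pulling in a crate is not an option, the same "stack buffer; read; decode" steps can be sketched with the standard library alone. The helper names `read_u8_at` and `read_u16_le_at` are hypothetical, and little-endian byte order is an assumption about the file format, not something stated in the question.

```rust
use std::io::{self, Read, Seek, SeekFrom};

// Hypothetical helper: seek to `offset` and read a single byte.
fn read_u8_at(reader: &mut (impl Read + Seek), offset: u64) -> io::Result<u8> {
    let mut buf = [0u8; 1];
    reader.seek(SeekFrom::Start(offset))?;
    reader.read_exact(&mut buf)?;
    Ok(buf[0])
}

// Hypothetical helper: seek to `offset` and decode two bytes as a u16.
// Little-endian is an assumption; use `u16::from_be_bytes` instead
// if the format stores integers big-endian.
fn read_u16_le_at(reader: &mut (impl Read + Seek), offset: u64) -> io::Result<u16> {
    let mut buf = [0u8; 2];
    reader.seek(SeekFrom::Start(offset))?;
    reader.read_exact(&mut buf)?;
    Ok(u16::from_le_bytes(buf))
}
```

With these, the two reads from the question would become `read_u8_at(&mut opened_file, 13)?` and `read_u16_le_at(&mut opened_file, 14)?`, each returning a number rather than raw bytes.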

Upvotes: 1

Aplet123

Reputation: 35512

The easiest way is to have a struct of owned arrays, then seek and read into the struct.

use std::io::{self, prelude::*, SeekFrom};

#[derive(Debug, Clone, Default)]
struct FileStruct {
    a1: [u8; 1],
    a2: [u8; 2],
}

fn main() -> io::Result<()> {
    let mut file_struct: FileStruct = Default::default();
    let mut opened_file = unimplemented!(); // open file somehow
    opened_file.seek(SeekFrom::Start(13))?;
    opened_file.read_exact(&mut file_struct.a1)?;
    opened_file.seek(SeekFrom::Start(14))?;
    opened_file.read_exact(&mut file_struct.a2)?;
    println!("{:?}", file_struct);
    Ok(())
}


This is still decently repetitive, so you can make a seek_read function to reduce the repetition:

use std::io::{self, prelude::*, SeekFrom};

#[derive(Debug, Clone, Default)]
struct FileStruct {
    a1: [u8; 1],
    a2: [u8; 2],
}

fn seek_read(mut reader: impl Read + Seek, offset: u64, buf: &mut [u8]) -> io::Result<()> {
    reader.seek(SeekFrom::Start(offset))?;
    reader.read_exact(buf)?;
    Ok(())
}

fn main() -> io::Result<()> {
    let mut file_struct: FileStruct = Default::default();
    let mut opened_file = unimplemented!(); // open file somehow
    seek_read(&mut opened_file, 13, &mut file_struct.a1)?;
    seek_read(&mut opened_file, 14, &mut file_struct.a2)?;
    println!("{:?}", file_struct);
    Ok(())
}
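
If you would rather get the bytes back as a return value than pass a buffer in, a const-generic variant is one option. This is only a sketch: the name `read_array_at` is hypothetical, and the array length `N` is inferred from the type the caller asks for.

```rust
use std::io::{self, Read, Seek, SeekFrom};

// Hypothetical helper: the array length N is a const generic parameter,
// so the caller chooses the size through the type of the result and
// never has to declare a buffer up front.
fn read_array_at<R: Read + Seek, const N: usize>(
    reader: &mut R,
    offset: u64,
) -> io::Result<[u8; N]> {
    let mut buf = [0u8; N];
    reader.seek(SeekFrom::Start(offset))?;
    reader.read_exact(&mut buf)?;
    Ok(buf)
}
```

Usage would then look like `let a2: [u8; 2] = read_array_at(&mut opened_file, 14)?;`, or `file_struct.a2 = read_array_at(&mut opened_file, 14)?;` when assigning straight into a struct field of array type.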


The repetition can be lowered even more by using a macro:

use std::io::{self, prelude::*, SeekFrom};

#[derive(Debug, Clone, Default)]
struct FileStruct {
    a1: [u8; 1],
    a2: [u8; 2],
}

macro_rules! read_offsets {
    ($file: ident, $file_struct: ident, []) => {};
    ($file: ident, $file_struct: ident, [$offset: expr => $field: ident $(, $offsets: expr => $fields: ident)*]) => {
        $file.seek(SeekFrom::Start($offset))?;
        $file.read_exact(&mut $file_struct.$field)?;
        read_offsets!($file, $file_struct, [$($offsets => $fields),*]);
    }
}

fn main() -> io::Result<()> {
    let mut file_struct: FileStruct = Default::default();
    let mut opened_file = unimplemented!(); // open file somehow
    read_offsets!(opened_file, file_struct, [13 => a1, 14 => a2]);
    println!("{:?}", file_struct);
    Ok(())
}
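
If a macro feels like too much machinery, a similar reduction in repetition can be had with a plain loop over (offset, field) pairs. The sketch below assumes the same `FileStruct` layout and offsets as above; the function name `read_file_struct` is hypothetical.

```rust
use std::io::{self, Read, Seek, SeekFrom};

#[derive(Debug, Clone, Default)]
struct FileStruct {
    a1: [u8; 1],
    a2: [u8; 2],
}

// Hypothetical constructor-style function: pairs each offset with a
// mutable slice of the field it fills, then does the seek + read in a loop.
fn read_file_struct(reader: &mut (impl Read + Seek)) -> io::Result<FileStruct> {
    let mut file_struct = FileStruct::default();

    // Borrowing distinct fields mutably at the same time is allowed.
    let targets: [(u64, &mut [u8]); 2] = [
        (13, &mut file_struct.a1[..]),
        (14, &mut file_struct.a2[..]),
    ];

    for (offset, field) in targets {
        reader.seek(SeekFrom::Start(offset))?;
        reader.read_exact(field)?;
    }

    Ok(file_struct)
}
```

Adding a field then means adding one struct member and one line to the `targets` table (and bumping its length), with no change to the loop itself.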


Upvotes: 4
