Matteo Monti

Reputation: 8980

Most idiomatic way to read a range of bytes from a file

I have a file, say myfile. Using Rust, I would like to open myfile, and read bytes N to M into a Vec, say myvec. What is the most idiomatic way to do so? Naively, I thought of using bytes(), then skip, take and collect, but that sounds so inefficient.

Upvotes: 5

Views: 9738

Answers (2)

Tim

Reputation: 8186

The existing answer works, but it reads the entire block that you're after into a Vec in memory. If the block is huge, or you have no use for it in memory, you ideally want a reader (something implementing io::Read) that you can copy straight into another file or pass to another API.

If your source implements Read + Seek then you can seek to the start position and then use Read::take to only read for a specific number of bytes.

use std::{fs::File, io::{self, Read, Seek, SeekFrom}};

fn main() -> io::Result<()> {
    let start = 20;
    let length = 100;

    let mut input = File::open("input.bin")?;

    // Seek to the start position
    input.seek(SeekFrom::Start(start))?;

    // Create a reader that stops after `length` bytes (or at end of file)
    let mut chunk = input.take(length);

    let mut output = File::create("output.bin")?;

    // Copy the chunk into the output file
    io::copy(&mut chunk, &mut output)?;
    Ok(())
}
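The same idea can be written once for any source that implements Read + Seek, not just File. The following is only a sketch: copy_range is a hypothetical helper name, not something from the standard library, and it uses by_ref so the caller keeps the reader afterwards.

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

/// Hypothetical helper: copies up to `length` bytes, starting at byte
/// offset `start`, from `src` into `dst`. Returns the number of bytes
/// actually copied, which is less than `length` if `src` ends early.
fn copy_range<R, W>(src: &mut R, start: u64, length: u64, dst: &mut W) -> io::Result<u64>
where
    R: Read + Seek,
    W: Write,
{
    // Move the read cursor to the first byte of the range.
    src.seek(SeekFrom::Start(start))?;
    // `by_ref` borrows the reader, so `take` does not consume `src`.
    let mut chunk = src.by_ref().take(length);
    // Stream the limited reader into the destination.
    io::copy(&mut chunk, dst)
}
```

Because the bounds are only Read + Seek and Write, this can be exercised with an in-memory std::io::Cursor as the source and a Vec<u8> as the destination, which is handy for testing without touching the file system.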

Upvotes: 2

Thomas

Reputation: 182063

The most idiomatic (to my knowledge) and relatively efficient way:

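use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};

fn main() -> io::Result<()> {
    let start = 10; // offset of the first byte to read
    let count = 10; // number of bytes to read

    let mut f = File::open("/etc/passwd")?;
    f.seek(SeekFrom::Start(start))?;
    let mut buf = vec![0u8; count];
    // Fails with UnexpectedEof if fewer than `count` bytes are available
    f.read_exact(&mut buf)?;
    Ok(())
}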

You indicated in the comments that you were concerned about the overhead of zeroing the memory before reading into it. Indeed there is a nonzero cost to this, but it's usually negligible compared to the I/O operations needed to read from a file, and the advantage is that your code remains 100% sound. But for educational purposes only, I tried to come up with an approach that avoids the zeroing.

Unfortunately, even with unsafe code, we cannot safely pass an uninitialized buffer to read_exact because of this paragraph in the documentation (emphasis mine):

No guarantees are provided about the contents of buf when this function is called, implementations cannot rely on any property of the contents of buf being true. It is recommended that implementations only write data to buf instead of reading its contents.

So it's technically legal for File::read_exact to read from the provided buffer, which means we cannot legally pass uninitialized data here (using MaybeUninit).
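One way to stay in safe code without zeroing the buffer yourself is to combine Read::take with Read::read_to_end, which appends into a Vec's spare capacity. The following is a sketch under assumptions: read_range is a hypothetical helper name, and whether the memory is really left unzeroed depends on how the standard library and the particular reader fill that spare capacity. In the question's terms, reading bytes N to M means start = N and count = M - N.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical helper: reads exactly `count` bytes starting at byte
/// offset `start` into a new Vec, without pre-filling it with zeroes.
fn read_range<R: Read + Seek>(src: &mut R, start: u64, count: usize) -> io::Result<Vec<u8>> {
    src.seek(SeekFrom::Start(start))?;
    // Reserve space only; the Vec's length stays 0, so nothing is zeroed here.
    let mut buf = Vec::with_capacity(count);
    // `take` caps the read at `count` bytes; `read_to_end` appends to `buf`.
    src.by_ref().take(count as u64).read_to_end(&mut buf)?;
    // Unlike `read_exact`, `read_to_end` succeeds on a short read,
    // so report a range that runs past the end of the input explicitly.
    if buf.len() < count {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "range extends past end of input",
        ));
    }
    Ok(buf)
}
```

The explicit length check keeps the behaviour close to read_exact. Drop it if a truncated result is acceptable for your use case.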

Upvotes: 11
