Reputation: 245
I'm new to Rust and I'm trying to write a simple backup program. As a first step, the files are broken down into blocks of variable length (via content-defined chunking).
To do this, I have to read the file byte by byte. Unfortunately, I find that the process is terribly slow. With dd I can read at up to 350 MiB/s, but I only get about 45 MiB/s with the following Rust code. (I left out all the chunking stuff.)
The file I am reading is around 7.7 GiB in size.
// main.rs
use std::fs::File;
use std::io::{BufReader, Read};
use std::time::Instant;

fn main() {
    let file = File::open("something_big.tar").expect("Cannot read file.");
    let buf = BufReader::new(file);
    let mut x = 0u8;
    let mut num_bytes = 0usize;
    let t1 = Instant::now();
    // Iterate byte by byte through the `Bytes` iterator.
    for b in buf.bytes() {
        match b {
            Ok(b) => {
                // Wrapping add: a plain `x += b` panics on overflow in debug builds.
                x = x.wrapping_add(b);
                num_bytes += 1;
                // chunking stuff omitted
            }
            Err(_) => panic!("I/O Error"),
        }
    }
    let dur = t1.elapsed().as_secs_f64();
    let num_mib = (num_bytes as f64) / 1_048_576f64;
    println!("RESULT: {}", x);
    println!("Read speed: {:.1} MiB / s", num_mib / dur);
}
Question: What is a better way to quickly iterate through the bytes of a file with Rust? And what is wrong with my code?
I know I could probably use the memmap crate or something similar, but I don't want to do that.
Upvotes: 3
Views: 5032
Reputation: 2618
I'm not sure why this is happening, but I'm seeing much faster times when manually read()ing from the BufReader. With the 512-byte array below, I'm seeing ~2700 MiB/s; with a single-byte array it's around 300 MiB/s.
The Bytes iterator apparently induces some overhead. The read loop below is more or less copy-pasted from its Iterator implementation.
use std::fs::File;
use std::io::{BufReader, ErrorKind, Read};
use std::time::Instant;

fn main() {
    let file = File::open("some-3.3gb-file").expect("Cannot read file.");
    let mut buf = BufReader::new(file);
    let mut x = 0u8;
    let mut num_bytes = 0usize;
    let t1 = Instant::now();
    // Read into a fixed-size array instead of going through `Bytes`.
    let mut bytes = [0u8; 512];
    loop {
        match buf.read(&mut bytes) {
            Ok(0) => break, // end of file
            Ok(n) => {
                // Only the first `n` bytes of the array are valid.
                for &b in &bytes[..n] {
                    num_bytes += 1;
                    // Wrapping add: a plain `x += b` panics on overflow in debug builds.
                    x = x.wrapping_add(b);
                }
            }
            // An interrupted read is not fatal; just try again.
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => panic!("{:?}", e),
        }
    }
    let dur = t1.elapsed().as_secs_f64();
    let num_mib = (num_bytes as f64) / 1_048_576f64;
    println!("RESULT: {}", x);
    println!("Read speed: {:.1} MiB / s for {:.1} MiB", num_mib / dur, num_mib);
}
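A possible variation on the loop above, which I have not benchmarked, is to skip the extra 512-byte array and work directly on the BufReader's own buffer through fill_buf() and consume() from the BufRead trait. The sketch below assumes a hypothetical helper named sum_bytes and uses an in-memory slice as a stand-in for the file so that it runs on its own. For a real file you would pass BufReader::new(file) instead.

```rust
use std::io::{self, BufRead, ErrorKind};

/// Hypothetical helper: wrapping-sums and counts every byte of `reader`,
/// iterating over the reader's internal buffer instead of a second array.
fn sum_bytes<R: BufRead>(mut reader: R) -> io::Result<(u8, usize)> {
    let mut x = 0u8;
    let mut num_bytes = 0usize;
    loop {
        // Borrow whatever is currently buffered, refilling it if it is empty.
        let chunk = match reader.fill_buf() {
            Ok(chunk) => chunk,
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        if chunk.is_empty() {
            break; // end of input
        }
        for &b in chunk {
            x = x.wrapping_add(b);
            // chunking logic would go here
        }
        let len = chunk.len();
        num_bytes += len;
        // Mark these bytes as processed so the next fill_buf() moves on.
        reader.consume(len);
    }
    Ok((x, num_bytes))
}

fn main() -> io::Result<()> {
    // In-memory stand-in for the file; `&[u8]` implements `BufRead`.
    let data: &[u8] = &[200, 100, 1];
    let (x, n) = sum_bytes(data)?;
    println!("RESULT: {} over {} bytes", x, n);
    Ok(())
}
```

Whether this beats the fixed-array version will depend on the buffer size and the per-byte work, so it is worth measuring with a release build before relying on it.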
Upvotes: 6