Is it always preferable to pass in a mutable reference vs creating and returning an owned value?

Question

Coming to Rust from dynamic languages like Python, I'm not used to the programming pattern where you provide a function with a mutable reference to an empty data structure and that function populates it. A typical example is reading a file into a String:

use std::fs::File;
use std::io::Read;

let mut f = File::open("file.txt").unwrap();
let mut contents = String::new();
f.read_to_string(&mut contents).unwrap();

To my Python-accustomed eyes, an API where you just create an owned value within the function and move it out as a return value looks much more intuitive / ergonomic / what have you:

let mut f = File::open("file.txt").unwrap();
let contents = f.read_to_string().unwrap();

Since the Rust standard library takes the former road, I figure there must be a reason for that.

Is it always preferable to use the reference pattern? If so, why? (Performance reasons? What specifically?) If not, how do I spot the cases where it might be beneficial? Is it mostly useful when I want to return another value in addition to populating the result data structure (as in the first example above, where .read_to_string() returns the number of bytes read)? Why not use a tuple? Is it simply a matter of personal preference?

DK. · Accepted Answer

If read_to_string returned an owned String, it would have to heap-allocate a new String every time it was called. Also, because Read implementations don't always know how much data there is to read, it would probably have to re-allocate the work-in-progress String several times as it grows. On top of that, every one of those temporary Strings has to go back to the allocator to be destroyed.

This is wasteful. Rust is a systems programming language. Systems programming languages abhor waste.

Instead, the caller is responsible for allocating and providing the buffer. If you only call read_to_string once, nothing changes. If you call it more than once, however, you can re-use the same buffer across calls without the constant allocate/resize/deallocate cycle. Although it doesn't apply in this specific case, similar interfaces can be designed to also support stack buffers, meaning that in some cases you can avoid heap activity entirely.
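
To make the two points above concrete, here is a minimal sketch. The helper names byte_counts and count_bytes are made up for illustration and are not part of the standard library. The first function re-uses one String across several read_to_string calls. The second uses Read::read with a fixed-size array, so the function itself needs no heap buffer at all.

```rust
use std::io::{self, Read};

/// Reads each source to its end into one shared String and returns the
/// number of bytes read from each. Hypothetical helper for illustration.
fn byte_counts<R: Read>(sources: &mut [R]) -> io::Result<Vec<usize>> {
    let mut buf = String::new(); // created once, outside the loop
    let mut counts = Vec::with_capacity(sources.len());
    for src in sources.iter_mut() {
        buf.clear(); // drops the old contents but keeps the capacity
        counts.push(src.read_to_string(&mut buf)?);
    }
    Ok(counts)
}

/// Counts the bytes in a reader using only a fixed-size stack array as
/// scratch space. Hypothetical helper for illustration.
fn count_bytes<R: Read>(mut src: R) -> io::Result<u64> {
    let mut chunk = [0u8; 1024]; // scratch space lives on the stack
    let mut total = 0u64;
    loop {
        // A more careful loop would retry on io::ErrorKind::Interrupted.
        let n = src.read(&mut chunk)?;
        if n == 0 {
            return Ok(total); // a zero-length read means end of input
        }
        total += n as u64;
    }
}
```

Because String::clear keeps the existing capacity, later iterations in byte_counts only touch the allocator when a source is larger than anything read so far.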

Having the caller pass the buffer in is strictly more flexible than the alternative.
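
One way to see this: if you do want the "return an owned String" style from the question, it takes a few lines to build on top of the buffer-taking method, whereas the reverse is not possible. A rough sketch, where read_all is a made-up name and not a standard library function:

```rust
use std::io::{self, Read};

/// Hypothetical convenience wrapper: allocates a fresh String on every
/// call and moves it out to the caller.
fn read_all<R: Read>(mut src: R) -> io::Result<String> {
    let mut contents = String::new();
    src.read_to_string(&mut contents)?;
    Ok(contents)
}
```

For whole files, newer versions of the standard library (Rust 1.26 and later) ship std::fs::read_to_string, which returns an owned String in much the same spirit. The buffer-taking Read::read_to_string remains available for the cases where re-use matters.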
