misterx527

Reputation: 97

Why does Rust not allow multiple mutable borrows when it's safe?

I'm having difficulty understanding why Rust's borrow checker does not allow multiple mutable borrows when it is safe to do so.

Here is an example:

fn borrow_mut(s : &mut String) {
    s.push_str(" world!");
    println!("{}", s);
}

fn main() {
    let mut s = String::from("hello");
    let rs : &mut String = &mut s;

    // second mutable borrow
    borrow_mut(&mut s);

    println!("{rs}");
}

This code fails to compile with the following message:

error[E0499]: cannot borrow `s` as mutable more than once at a time
  --> main.rs:11:16
   |
8  |     let rs : &mut String = &mut s;
   |                            ------ first mutable borrow occurs here
...
11 |     borrow_mut(&mut s);
   |                ^^^^^^ second mutable borrow occurs here
12 | 
13 |     println!("{rs}");
   |                -- first borrow later used here

rs points to the variable of type String in the stack frame. The String contains a pointer to memory on the heap. So even if the string reallocates its data in borrow_mut(), both pointers are still valid, and this code should be safe.

Could someone explain why the borrow checker prevents multiple mutable borrows even when it's safe?

Upvotes: 2

Views: 1278

Answers (3)

Jason Orendorff

Reputation: 45086

Here are two more non-thread-related reasons that mut is exclusive. Or maybe the second is just the first reason in disguise. You decide.

  1. Sometimes mutating data invalidates pointers into the affected data. For example, growing, shrinking, or clearing a Vec invalidates any pointers you've got to the elements. The elements simply don't exist at that memory address anymore; the pointers are invalid and must not be used! Or, if an enum value contains data, assigning a different variant to that enum drops all the old data, so any pointers to that data are invalidated.

    Using an invalidated pointer is undefined behavior in C and C++ (Rust too, if you're using unsafe pointers). But in Rust, no references are ever invalidated, exactly because there are never any other references to data that's being mutated.

  2. A common C++ gotcha when implementing operator= is forgetting to handle the case where an object is assigned to itself. The same gotcha applies to any other method that modifies this and also takes an argument of the same type. In Rust, it never happens: you can't have both &mut self and another argument that refers to the same value.

    This one doesn't always implicate memory safety (when it does, it's usually some variation on reason 1), but it's one less weird possibility for programmers to have to remember.
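
Both reasons can be seen in a small sketch. The `Buffer` type and its `assign` method are made up for illustration (a rough stand-in for a C++ copy-assignment operator), and so are the function names. The commented-out lines are the ones the borrow checker rejects with E0502; the code around them shows one way to stay within the rules.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Buffer {
    data: Vec<u8>,
}

impl Buffer {
    // Hypothetical analogue of a C++ copy-assignment operator.
    // If `other` could alias `self`, the `clear` below would wipe the
    // source before it is copied (the classic self-assignment bug).
    fn assign(&mut self, other: &Buffer) {
        self.data.clear();
        self.data.extend_from_slice(&other.data);
    }
}

// Reason 1: a reference into a Vec cannot be kept across a mutation.
fn first_then_push() -> (i32, usize) {
    let mut v = vec![1, 2, 3];

    // let first = &v[0];
    // v.push(4);            // rejected (E0502): `v` is still borrowed by `first`
    // println!("{first}");  // `first` could be dangling if the push reallocated

    let first = v[0]; // copy the value out instead of holding a reference
    v.push(4);
    (first, v.len())
}

// Reason 2: `&mut self` and another argument can never be the same value.
fn assign_from_copy() -> Buffer {
    let mut a = Buffer { data: vec![1, 2, 3] };

    // a.assign(&a); // rejected (E0502): `a` would be borrowed mutably and immutably

    let snapshot = a.clone(); // an explicit, separate value
    a.assign(&snapshot);
    a
}
```

Because `assign` can assume `other` is a different object, it needs no self-assignment check at all.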

Rust would have to be very careful in allowing non-exclusive mut references, to ensure undefined behavior can still be ruled out in safe code. So far I guess that has not been attempted.

Upvotes: 0

Jason Orendorff

Reputation: 45086

It is a very deep rule of Rust that mut means "exclusive". More than just thread safety relies on this!

Very roughly speaking, the Rust compiler has two parts:

  • The Rust "frontend", which includes the borrow checker and translates your Rust code to an assembly-like language called LLVM IR
  • LLVM, which translates LLVM IR to machine code

The frontend tells LLVM that mut references are exclusive. That is, the frontend promises that while a mut reference to a value exists, no other pointers will be used to access that value. The frontend can make this promise because it borrow-checked your code.

LLVM has features like noalias and alias.scope metadata specifically to allow compilers to provide this kind of promise. It unlocks powerful optimizations in LLVM. But if the promise is broken, those optimizations could go powerfully wrong. For example, LLVM might legitimately reason as follows:

  • First, inline the call to <String as Display>::fmt inside the println! on line 13.

  • The inlined code only needs two parts of the String: the length and the pointer to the characters.

  • Between the point where we set rs on line 8, and the point where it's used on line 13, rs is not used to modify the String.

  • And rs is exclusive; therefore nobody else is modifying it either. The String is not changed. (Note that this conclusion is wrong; the program does modify that string. So after this point LLVM's reasoning is going to be increasingly off-the-rails.)

  • Therefore we don't have to wait until line 13 to read the pointer and length from *rs. We can read them at any point in that range. It'll work because those fields aren't going to change.

  • Memory accesses will finish faster if you get started earlier. So let's read as early as possible. Move those reads to line 8.

  • Actually, if we're doing that, the length is guaranteed to be 5. So we don't need to read that at all.

Of course, LLVM doesn't actually "reason" like a human would, but it consists of multiple optimizer passes, each of which incrementally tweaks the code in ways that can have the same cumulative effect.

So you can see how LLVM could generate code that fetches rs.ptr before calling borrow_mut, which then grows the string, invalidating the pointer. The println! would then access freed memory, or at least print the wrong number of bytes.

I'm not sure LLVM would actually do something like this, if you somehow commented out the Rust borrow checker and tried it. But I wouldn't be surprised. Moving memory accesses around ("hoisting reads") is a real optimization that LLVM and other compilers really do. It's not even considered particularly fancy! And the Rust frontend might do similar optimizations before handing the code off to LLVM—I don't know.
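
A much smaller illustration of the same kind of reasoning, as a sketch only: the function names here are invented, and whether a given compiler version actually performs this rewrite is not something the example guarantees. Since `a` and `b` are both `&mut`, they cannot point at the same `i32`, so an optimizer is allowed to read `*b` once and compute `*a + 2 * *b`.

```rust
// `a` and `b` are exclusive, so they are known not to alias.
// An optimizer may therefore load `*b` a single time and reuse it.
fn add_twice(a: &mut i32, b: &mut i32) {
    *a += *b;
    *a += *b;
}

fn add_twice_demo() -> i32 {
    let mut x = 1;
    let mut y = 2;
    add_twice(&mut x, &mut y); // 1 + 2 + 2

    // add_twice(&mut x, &mut x); // rejected (E0499): two mutable borrows of `x`
    //
    // If that call were allowed with x == 1, the code as written would
    // give 4 (1 + 1 = 2, then 2 + 2 = 4), while the "read *b once"
    // version would give 3 (1 + 2 * 1). The optimization is only valid
    // because the borrow checker rules the aliasing call out.
    x
}
```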

That's a long answer to your question. The short answer is, there is no "when it's safe". It's never safe to break this rule, because it's not just for humans reasoning about their code. mut means "exclusive" to the compiler too.

Upvotes: 5

ShadowRanger

Reputation: 155403

It's for thread safety, to avoid data races. If two such mutable borrows could exist at once, then two threads of execution could both attempt to modify the original data. If they did, all sorts of nasty race conditions could arise, e.g. if both threads try to append to the string:

  • The underlying array holding the data can get reallocated twice, with one of them leaked
  • The appended data could end up writing out of bounds due to time-of-check/time-of-use issues
  • You could end up with inconsistent definitions of the length and capacity
  • On some architectures and data sizes, tearing could mean a single logical value is read half as the old version and half as the updated value (producing something that could easily be unrelated to either the old or new value)
  • etc.

Borrows as a language feature mean that a function can temporarily hand off its unique mutable access to some other function; while that other function holds the borrow, the original object can't be accessed through anything but that mutable borrow. For non-mutable borrows, the same mechanism prevents mutable borrows that might cause races between reads through the non-mutable borrow and writes through the mutable borrow. The borrow checker is preventing you from launching a thread that modifies s, then calling borrow_mut from the main thread, with the two threads producing garbage or crashing the program when they modify s simultaneously.
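
For comparison, here is a sketch of two threads appending to the same String within the rules. It assumes the standard library's `std::thread::scope` and `std::sync::Mutex`; the function name is made up for the example. Each thread only gets mutable access while it holds the lock, so the appends cannot overlap, although their order is not fixed.

```rust
use std::sync::Mutex;
use std::thread;

// Hypothetical helper: two threads append to one String through a Mutex.
fn append_from_two_threads() -> String {
    let s = Mutex::new(String::from("hello"));

    thread::scope(|scope| {
        // Each closure borrows `s` immutably; the Mutex hands out
        // exclusive access to the String one thread at a time.
        scope.spawn(|| {
            s.lock().unwrap().push_str(" world");
        });
        scope.spawn(|| {
            s.lock().unwrap().push_str("!");
        });
    }); // all scoped threads are joined here

    // No thread holds the lock anymore, so the String can be taken out.
    s.into_inner().unwrap()
}
```

The result is either "hello world!" or "hello! world" depending on which thread takes the lock first, but it is never a torn or corrupted string.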

To be clear, with an advanced borrow-checker in some future version of Rust, this code could be made to work (the code you wrote does nothing inherently unsafe). But fully analyzing deep code paths to ensure nothing evil could possibly occur is hard, and it's relatively easy to impose stricter rules (which might be loosened in the future if the language designers are sure doing so won't impose restrictions that bite them later). Your code would work just fine if you passed the single mutable borrow you already had into borrow_mut after all; your code is not made worse by doing things The Rust Way™.
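
A minimal sketch of that last suggestion, based on the code in the question (the wrapper function name is invented so the result can be returned and checked): passing `rs` itself lets the compiler reborrow it for the duration of the call, and `rs` is usable again afterwards.

```rust
fn borrow_mut(s: &mut String) {
    s.push_str(" world!");
    println!("{}", s);
}

// Hypothetical wrapper around the body of `main` from the question.
fn reuse_existing_borrow() -> String {
    let mut s = String::from("hello");
    let rs: &mut String = &mut s;

    // Pass the borrow we already have instead of writing `&mut s` again.
    // `rs` is implicitly reborrowed for the call, not moved.
    borrow_mut(rs);

    // The reborrow ended when the call returned, so `rs` works here.
    println!("{rs}");
    rs.clone()
}
```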

Upvotes: 2
