Interfector

Reputation: 1898

How can I mutate a shared variable from multiple threads, disregarding data races?

How can I mutate the variable i inside the closure? Race conditions are considered to be acceptable.

use rayon::prelude::*;

fn main() {

    let mut i = 0;
    let mut closure = |_| {
        i = i + 1;
    };

    (0..100).into_par_iter().for_each(closure);
}

This code fails with:

error[E0525]: expected a closure that implements the `Fn` trait, but this closure only implements `FnMut`
  --> src\main.rs:6:23
   |
6  |     let mut closure = |_| {
   |                       ^^^ this closure implements `FnMut`, not `Fn`
7  |         i = i + 1;
   |         - closure is `FnMut` because it mutates the variable `i` here
...
10 |     (0..100).into_par_iter().for_each(closure);
   |                              -------- the requirement to implement `Fn` derives from here

Upvotes: 12

Views: 1507

Answers (4)

user4815162342

Reputation: 154866

The accepted answer explains the situation thoroughly: you definitely don't want data races in your code, because they are undefined behavior and distinct from the more general "race conditions". Nor do you need data races to update shared data; there are better, efficient ways to do that. But to satisfy curiosity, this answer attempts to answer the question as literally asked: if you were reckless enough to intentionally ignore data races and incur undefined behavior at your own peril, could you do it in unsafe Rust?

You indeed can. The code and discussion in this answer are provided for educational purposes, such as checking what kind of code the compiler generates. If code that intentionally incurs UB offends you, please stop reading here. You've been warned. :)

The obvious way to convince Rust to allow this data race is to create a raw pointer to i, send the pointer to the closure, and dereference it to mutate i. The dereference is unsafe because it leaves it to the programmer to ensure that no mutable references exist simultaneously, and that writes to the underlying data are synchronized with other accesses to it. While we can easily ensure the former by simply not creating a reference, we obviously won't ensure the latter:

// Must wrap raw pointer in type that implements Sync.
struct Wrap(*mut i32);
unsafe impl Sync for Wrap {}

// Contains undefined behavior - don't use this!
fn main() {
    let mut i = 0;
    let i_ptr = Wrap(&mut i as *mut i32);
    let closure = |_| {
        unsafe { *i_ptr.0 = *i_ptr.0 + 1 }; // XXX: UB!
    };

    (0..100).into_par_iter().for_each(closure);
    println!("{}", i);
}

Playground

Note that raw pointers implement neither Sync nor Send, so they require a wrapper to be used across threads. The wrapper unsafely implements Sync, but that unsafe is not itself UB: accessing the pointer is safe, and there would be no UB if we, say, only printed it, or even dereferenced it for reading (as long as no one else wrote to i). Writing through the dereferenced pointer is where we create UB, and that is what requires unsafe.

While this is the kind of code the OP might have been after (it even prints 100 when run), it is of course still undefined behavior, and could break on different hardware or when compiled with a different compiler version. Even a slight change, such as creating a mutable reference with let i_ref = unsafe { &mut *i_ptr.0 } and updating it with *i_ref += 1, can change its behavior.

In the context of C++11, Hans Boehm wrote an entire article on the dangers of so-called "benign" data races and why they cannot be allowed in the C++ memory model (which Rust shares).

Upvotes: 1

trent

Reputation: 27895

There is a difference between a race condition and a data race.

A race condition is any situation in which the outcome of two or more events depends on which one happens first, and nothing enforces a relative ordering between them. This can be fine: as long as all possible orderings are acceptable, you may accept that your code has a race in it.

A data race is a specific kind of race condition where the events are unsynchronized accesses to the same memory and at least one of them is a mutation. Data races are undefined behavior. You cannot "accept" a data race because its existence invalidates the entire program; a program with an unavoidable data race in it does not have any defined behavior at all, so it does nothing useful.

Here's a version of your code that has a race condition, but not a data race:

    use std::sync::atomic::{AtomicI32, Ordering};
    let i = AtomicI32::new(0);
    let closure = |_| {
        i.store(i.load(Ordering::Relaxed) + 1, Ordering::Relaxed);
    };

    (0..100).into_par_iter().for_each(closure);

Because the loads and stores are not ordered with respect to the concurrently executing threads, there is no guarantee that the final value of i will be exactly 100. It could be 99, or 72, or 41, or even 1. This code has indeterminate, but defined behavior because although you don't know the exact order of events or the final outcome, you can still reason about its behavior. In this case, you can prove that the final value of i must be at least 1 and no greater than 100.

Note that in order to write this racy code, I still had to use AtomicI32 and atomic load and store. Not caring about the order of events in different threads doesn't free you from having to think about synchronizing memory access.

If your original code compiled, it would have a data race.¹ This means there are no guarantees about its behavior at all. So, assuming you actually accept data races, here's a version of your code that is consistent with what a compiler is allowed to do with it:

fn main() {}

Oh, right, undefined behavior must never occur. So this hypothetical compiler just deleted all your code because it is never allowed to run in the first place.

It's actually even worse than that. Suppose you had written something like this:

fn main() {
    let mut i = 0;
    let mut closure = |_| {
        i = i + 1;
    };

    (0..100).into_par_iter().for_each(closure);

    if i < 100 || i >= 100 {
        println!("this should always print");
    } else {
        println!("this should never print");
    }
}

What should this code print? If there are no data races, this code must emit the following:

this should always print

But if we allow data races, it might also print this:

this should never print

Or it could even print this:

this should never print
this should always print

If you think there is no way it could do the last thing, you are wrong. Undefined behavior in a program cannot be accepted, because it invalidates analysis even of correct code that has nothing obvious to do with the original error.

How likely is any of this to happen if you just use unsafe and ignore the possibility of a data race? Probably not very likely, to be honest. If you use unsafe to bypass the checks and look at the generated assembly, it may well even be correct. But the only way to be sure is to write in assembly language directly, where you can understand and code to the machine's model. If you want to use Rust, you have to code to Rust's model, even if that means you lose a little performance.

How much performance? Probably not much, if anything. Atomic operations are very efficient and on many architectures, including the one you're probably using right now to read this, they actually are exactly as fast as non-atomic operations in cases like this. If you really want to know how much potential performance you lose, write both versions and benchmark them, or simply compare the assembly code with and without atomic operations.


¹ Technically, we can't say that a data race must occur, because it depends on whether any threads actually access i at the same time or not. If for_each decided for some reason to run all the closures on the same OS thread, for example, this code would not have a data race. But the fact that it may have a data race still poisons our analysis because we can't be sure it doesn't.

Upvotes: 9

Netwave

Reputation: 42678

You cannot do exactly that; you need some safe synchronisation to happen under the hood, for example an Arc combined with atomic operations. There are some examples in the documentation:

use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

let val = Arc::new(AtomicUsize::new(5));

for _ in 0..10 {
    let val = Arc::clone(&val);

    thread::spawn(move || {
        let v = val.fetch_add(1, Ordering::SeqCst);
        println!("{:?}", v);
    });
}

Playground

(As Adien4 points out, there is no need for the Arc or the move in the second example: Rayon only requires the closure to be Send + Sync.) That leads us to your example, which could be adapted as:

use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use rayon::prelude::*;

fn main() {

    let i = AtomicUsize::new(5);
    let closure = |_| {
        i.fetch_add(1, Ordering::SeqCst);
    };

    (0..100).into_par_iter().for_each(closure);
}

Playground

Upvotes: 6

jofel

Reputation: 3405

This is not possible as written, because it would require unsynchronized parallel mutation of i, which is a data race. You can use a Mutex to allow safe access from multiple threads.

Upvotes: 4
