Konstantin W

Reputation: 481

Why do I get a deadlock when using Tokio with a std::sync::Mutex?

I stumbled upon a deadlock condition when using Tokio:

use tokio::time::{delay_for, Duration};
use std::sync::Mutex;

#[tokio::main]
async fn main() {
    let mtx = Mutex::new(0);

    tokio::join!(work(&mtx), work(&mtx));

    println!("{}", *mtx.lock().unwrap());
}

async fn work(mtx: &Mutex<i32>) {
    println!("lock");
    {
        let mut v = mtx.lock().unwrap();
        println!("locked");
        // slow redis network request
        delay_for(Duration::from_millis(100)).await;
        *v += 1;
    }
    println!("unlock")
}

This produces the following output and then hangs forever:

lock
locked
lock

According to the Tokio docs, using std::sync::Mutex is ok:

Contrary to popular belief, it is ok and often preferred to use the ordinary Mutex from the standard library in asynchronous code.

However, replacing the Mutex with a tokio::sync::Mutex does not trigger the deadlock, and everything works "as intended", but only in the example listed above. In a real-world scenario, where the delay is caused by some Redis request, it still fails.

I think it might be because I am not actually spawning any threads. Even though the two futures are executed "in parallel", both lock attempts happen on the same thread, since await just yields execution.

What is the Rustacean way to achieve what I want without spawning a separate thread?

Upvotes: 8

Views: 4195

Answers (1)

Matthias247

Reputation: 10396

The reason why it is not OK to use a std::sync::Mutex here is that you hold it across the .await point. In this case:

  • task 1 holds the Mutex, but is suspended on delay_for.
  • task 2 gets scheduled and runs, but cannot obtain the Mutex since it's still owned by task 1. It blocks synchronously while trying to obtain the Mutex.

Since task 2 is blocked, the runtime thread is fully blocked as well. It cannot go into its timer handling state (which happens when the runtime is idle and is not running user tasks), and therefore cannot resume task 1.

Therefore you now are observing a deadlock.
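
To make the blocking step concrete, here is a small std-only sketch. It is not the code from the question: the function names are made up for illustration, and it uses try_lock so that the second attempt reports the problem instead of hanging the way lock() would.

```rust
use std::sync::{Mutex, TryLockError};

/// Hypothetical helper: returns true if a second lock attempt on the same
/// thread fails while the first guard is still alive.
fn second_lock_would_block(mtx: &Mutex<i32>) -> bool {
    // "task 1" takes the lock and keeps the guard alive.
    let _guard = mtx.lock().unwrap();
    // "task 2" tries to lock from the same thread. With lock() this would
    // never return; try_lock() reports WouldBlock instead.
    matches!(mtx.try_lock(), Err(TryLockError::WouldBlock))
}

/// Hypothetical helper: returns true if the lock is available again once
/// the guard has gone out of scope.
fn lock_free_after_drop(mtx: &Mutex<i32>) -> bool {
    {
        let mut guard = mtx.lock().unwrap();
        *guard += 1;
    } // guard dropped here, the lock is released
    mtx.try_lock().is_ok()
}

fn main() {
    let mtx = Mutex::new(0);
    println!("blocked while held: {}", second_lock_would_block(&mtx));
    println!("free after drop: {}", lock_free_after_drop(&mtx));
}
```

In the question's code the first guard stays alive for the whole delay_for, so the second lock() is in the first situation and never gets out of it.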

==> If you need to hold a Mutex across an .await point, you have to use an async Mutex. Synchronous Mutexes are fine in async programs, as the Tokio documentation describes, but they must not be held across .await points.

Upvotes: 9
