quentinadam

Reputation: 3150

How to pass a struct with Rc fields between threads (without simultaneous access)

I have the following (simplified) code that spawns a few threads that do long and complex operations to build a Transactions struct. This Transactions struct contains fields with Rc. At the end of the threads I want to return the computed Transactions struct to the calling thread through an mpsc::channel.

use std::thread;
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::rc::Rc;

#[derive(Debug)]
struct Transaction {
  id: String,
}

#[derive(Debug)]
struct Transactions {
  list: Vec<Rc<Transaction>>,
  index: HashMap<String, Rc<Transaction>>,
}

fn main() {
  
  let (tx, rx) = channel();
  
  for _ in 0..4 {
    let tx = Sender::clone(&tx);
    thread::spawn(move || {
      // complex and long computation to build a Transactions struct
      let transactions = Transactions { list: Vec::new(), index: HashMap::new() };
      tx.send(transactions).unwrap();
    });
  }
  
  drop(tx);

  for transactions in rx {
    println!("Got: {:?}", transactions);
  }

}

The compiler complains that std::rc::Rc<Transaction> cannot be sent safely between threads because it does not implement the std::marker::Send trait.

error[E0277]: `std::rc::Rc<Transaction>` cannot be sent between threads safely
   --> src/main.rs:23:5
    |
23  |     thread::spawn(move || {
    |     ^^^^^^^^^^^^^ `std::rc::Rc<Transaction>` cannot be sent between threads safely
    |
    = help: the trait `std::marker::Send` is not implemented for `std::rc::Rc<Transaction>`
    = note: required because of the requirements on the impl of `std::marker::Send` for `std::ptr::Unique<std::rc::Rc<Transaction>>`
    = note: required because it appears within the type `alloc::raw_vec::RawVec<std::rc::Rc<Transaction>>`
    = note: required because it appears within the type `std::vec::Vec<std::rc::Rc<Transaction>>`
    = note: required because it appears within the type `Transactions`
    = note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::mpsc::Sender<Transactions>`
    = note: required because it appears within the type `[closure@src/main.rs:23:19: 27:6 tx:std::sync::mpsc::Sender<Transactions>]`

I understand that I could replace Rc with Arc, but I would like to know whether there is another solution that avoids the performance penalty of Arc, because the Rc values are never accessed by two threads at the same time.

Upvotes: 5

Views: 1849

Answers (2)

Mihir Luthra

Reputation: 6779

Unfortunately I can't delete this post as this is the accepted answer but I want to point to this link that I missed before:

Is it safe to `Send` struct containing `Rc` if strong_count is 1 and weak_count is 0?

Since Rc is not Send, its implementation can be optimized in a variety of ways. The entire memory could be allocated using a thread-local arena. The counters could be allocated using a thread-local arena, separately, so as to seamlessly convert to/from Box…. This is not the case at the moment, AFAIK, however the API allows it, so the next release could definitely take advantage of this.
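One way to stay clear of that concern is to never send an Rc at all. The following is only a sketch under an assumption the question does not confirm: that the worker can produce plain owned Transaction values and that the Rc sharing is only needed after the result reaches the receiving thread. The helper names from_owned and collect_from_workers are hypothetical.

```rust
use std::collections::HashMap;
use std::rc::Rc;
use std::sync::mpsc::channel;
use std::thread;

#[derive(Debug)]
struct Transaction {
    id: String,
}

#[derive(Debug)]
struct Transactions {
    list: Vec<Rc<Transaction>>,
    index: HashMap<String, Rc<Transaction>>,
}

impl Transactions {
    // Hypothetical helper: wraps owned values in Rc on the current thread.
    fn from_owned(items: Vec<Transaction>) -> Self {
        let list: Vec<Rc<Transaction>> = items.into_iter().map(Rc::new).collect();
        let index = list
            .iter()
            .map(|t| (t.id.clone(), Rc::clone(t)))
            .collect();
        Transactions { list, index }
    }
}

// Hypothetical helper: spawns the workers and gathers their results.
fn collect_from_workers(workers: usize) -> Vec<Transactions> {
    let (tx, rx) = channel::<Vec<Transaction>>();
    for worker in 0..workers {
        let tx = tx.clone();
        thread::spawn(move || {
            // Stand-in for the long computation; only Send data leaves the thread.
            let items = vec![Transaction { id: format!("tx-{}", worker) }];
            tx.send(items).unwrap();
        });
    }
    // Drop the original sender so the receiver stops once all workers finish.
    drop(tx);
    // Every Rc is created here, on the receiving thread.
    rx.into_iter().map(Transactions::from_owned).collect()
}

fn main() {
    for transactions in collect_from_workers(4) {
        println!("Got: {:?}", transactions);
    }
}
```

This does not help if the worker itself needs the shared index during its computation; in that case the index would have to be rebuilt after the transfer.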


Old Answer

Since you don't want to use Arc, you could use the newtype pattern and wrap Rc in a type that implements Send and Sync. These traits are unsafe to implement, and once you do so it is entirely up to you to ensure that you don't cause undefined behaviour.


Wrapper around Rc would look like:

use std::ops::Deref;
use std::rc::Rc;

#[derive(Debug)]
struct RcWrapper<T> {
    rc: Rc<T>
}

impl<T> Deref for RcWrapper<T> {
    type Target = Rc<T>;

    fn deref(&self) -> &Self::Target {
        &self.rc
    }
}

unsafe impl<T: Send> Send for RcWrapper<T> {}
unsafe impl<T: Sync> Sync for RcWrapper<T> {}

Then,

#[derive(Debug)]
struct Transactions {
    list: Vec<RcWrapper<Transaction>>,
    index: HashMap<String, RcWrapper<Transaction>>,
}


That said, the Deref impl is not worth much here, since most of Rc's functions are associated functions rather than methods. Rc is usually cloned with Rc::clone(&rc), but the equivalent rc.clone() still works, and that is probably the only case where Deref helps. As a workaround, you could add wrapper methods that call Rc's functions, for clarity.

Update:

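I found the send_wrapper crate, which seems to serve this purpose. Note that, according to its documentation, SendWrapper panics if the wrapped value is accessed or dropped on a thread other than the one that created it, so it only helps when the value is merely carried through other threads.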

You could use it like:

use send_wrapper::SendWrapper;

#[derive(Debug)]
struct Transactions {
    list: Vec<SendWrapper<Rc<Transaction>>>,
    index: HashMap<String, SendWrapper<Rc<Transaction>>>,
}

PS: I would suggest sticking with Arc. The overhead is generally not that high unless you clone very frequently. I am not sure how Rc is implemented. Send allows a type to be sent to other threads, and if there is anything thread-local involved, such as thread-local locks or data, I am not sure how that would be handled.
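For reference, a minimal sketch of the Arc version of the question's code. The build_transactions and collect_from_workers functions are hypothetical stand-ins for the long computation and the spawning loop; everything else follows the original structs.

```rust
use std::collections::HashMap;
use std::sync::mpsc::channel;
use std::sync::Arc;
use std::thread;

#[derive(Debug)]
struct Transaction {
    id: String,
}

#[derive(Debug)]
struct Transactions {
    list: Vec<Arc<Transaction>>,
    index: HashMap<String, Arc<Transaction>>,
}

// Hypothetical stand-in for the long computation in the question.
fn build_transactions(worker: usize) -> Transactions {
    let t = Arc::new(Transaction { id: format!("tx-{}", worker) });
    let mut index = HashMap::new();
    index.insert(t.id.clone(), Arc::clone(&t));
    Transactions { list: vec![t], index }
}

// Hypothetical helper: spawns the workers and gathers their results.
fn collect_from_workers(workers: usize) -> Vec<Transactions> {
    let (tx, rx) = channel();
    for worker in 0..workers {
        let tx = tx.clone();
        thread::spawn(move || {
            // Transactions is Send now, because Arc<Transaction> is Send.
            tx.send(build_transactions(worker)).unwrap();
        });
    }
    // Drop the original sender so the receiver stops once all workers finish.
    drop(tx);
    rx.into_iter().collect()
}

fn main() {
    for transactions in collect_from_workers(4) {
        println!("Got: {:?}", transactions);
    }
}
```

The only change to the data model is Rc to Arc; the channel code stays the same as in the question.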

Upvotes: 0

RedBorg

Reputation: 145

Just DO NOT do it.

I was making this a comment, but I think it actually answers your question and warns others.

This seems very unsound! Rc is not about managing access, it’s about making sure something lives long enough to be shared between different “owners”/“borrowers” by counting how many references are alive. If there are two (Rc) references to the same value in two different threads, the lack of atomicity could let both threads change the reference count AT THE SAME TIME, which could corrupt the count, which could cause memory leaks, or worse, prematurely dropping the allocation and UB.

This is because of the classic sync problem of incrementing a shared variable:

Steps of incrementing a variable:

  1. Read variable and store it in the stack.
  2. Add 1 to the copy in the stack.
  3. Write back the result in the variable

That’s all fine with one thread, but let’s see what could happen otherwise:

Multithreaded sync incident (Threads A & B)

  1. x=0
  2. A: read x into xa (stack), xa = 0
  3. B: read x into xb, xb = 0
  4. A: increment xa, xa = 1
  5. A: write xa to x, x = 1
  6. B: increment xb, xb = 1
  7. B: write xb to x, x = 1
  8. x is now 1

You have now incremented 0 twice, with the result being 1: BAD!
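The interleaving above can be sketched in safe Rust. This is an illustration only: replay_lost_update replays the steps on a single thread so the result is deterministic, and atomic_increments shows the same kind of increment done as an atomic read-modify-write, similar to how Arc updates its count. Both function names are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Replays the interleaving on one thread, so the outcome is deterministic.
// `xa` and `xb` stand for the local copies held by threads A and B.
#[allow(unused_assignments)]
fn replay_lost_update() -> usize {
    let mut x = 0;
    let xa = x; // A: read x
    let xb = x; // B: read x
    x = xa + 1; // A: increment and write back
    x = xb + 1; // B: increment and write back, overwriting A's update
    x
}

// Increments a shared counter from several threads using atomic
// read-modify-write operations, so no update can be lost.
fn atomic_increments(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("replayed non-atomic increments: {}", replay_lost_update());
    println!("atomic increments: {}", atomic_increments(2, 100_000));
}
```

The replay ends at 1 after two increments, while the atomic version always ends at threads * per_thread.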

If x was the reference count of an Rc, it would think only one reference is alive. If one reference is dropped, it will think there’s no more reference alive and will drop the value, but there’s actually still a reference out there that thinks it’s ok to access the data, therefore Undefined Behaviour, therefore VERY BAD!

The performance cost of Arc is negligible compared to everything else, it’s absolutely worth using.

Upvotes: 1
