Kapichu

Reputation: 3706

Do I have to be concerned about the overhead of `Rc`?

Am I right to assume that the only thing that "slows down" an Rc is that it checks whether to deallocate the object when it is dropped? Besides that, how large is the overhead of dereferencing an Rc, i.e. should I be concerned about it?
Are those two functions almost equally fast, or is there a notable difference in speed?

use std::rc::Rc;

fn test_with_box() {
    let b = Box::new(1.0);
    let x = *b * 2.0;
}

fn test_with_rc() {
    let rc = Rc::new(1.0);
    let x = *rc * 2.0;
}

After all, the referenced object in test_with_rc() only ever has one reference, so within that function it behaves like a Box (viewed from the outside, not internally, of course).

I suspect that Rcs are actually faster than I think.

PS: When talking about "fast" I mean both dereferencing and allocating/deallocating.

Upvotes: 17

Views: 4787

Answers (2)

Chris Morgan

Reputation: 90842

Rc<T> is very, very cheap. It's quite a bit more expensive than a plain T (boxing values is comparatively expensive in micro-optimisation terms), but scarcely less efficient than Box<T>.

It’s just like Box, but with an additional couple of words for the strong and weak reference counts. The only operations that touch those counts are creating an Rc (initialises the counts), cloning an Rc (increments the strong count), dropping an Rc (decrements the strong count and runs the destructor if it reaches zero), downgrading to a Weak (increments the weak count) and upgrading from a Weak (increments the strong count).

Dereferencing is a simple memory operation just like it is with Box.
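
As a small illustrative sketch (not part of the original answer; it only uses the standard Rc::strong_count and Rc::weak_count helpers), you can watch which operations actually touch the counts, and see that plain dereferencing touches none of them:

use std::rc::Rc;

fn main() {
    let a = Rc::new(1.0);
    assert_eq!(Rc::strong_count(&a), 1);   // creating the Rc initialises the counts

    let b = Rc::clone(&a);                 // cloning increments the strong count
    assert_eq!(Rc::strong_count(&a), 2);

    let w = Rc::downgrade(&a);             // downgrading increments the weak count
    assert_eq!(Rc::weak_count(&a), 1);

    drop(b);                               // dropping decrements the strong count
    assert_eq!(Rc::strong_count(&a), 1);

    let x = *a * 2.0;                      // dereferencing is just a pointer read
    assert_eq!(x, 2.0);
    drop(w);
}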

Upvotes: 18

llogiq

Reputation: 14541

To answer your question, you can turn to the test crate, which contains a benchmarking solution (this needs a nightly):

#![feature(test)]
extern crate test;

use std::rc::Rc;

fn test_with_box() {
    let b = Box::new(1.0);
    test::black_box(*b * 2.0);
}

fn test_with_rc() {
    let rc = Rc::new(1.0);
    test::black_box(*rc * 2.0);
}

#[bench]
fn bench_box(b: &mut test::Bencher) {
    b.iter(test_with_box)
}

#[bench]
fn bench_rc(b: &mut test::Bencher) {
    b.iter(test_with_rc)
}

Now compiling & running this:

$ rustc --test -O rc.rs && ./rc --bench

running 2 tests
test bench_box ... bench:          22 ns/iter (+/- 0)
test bench_rc  ... bench:          22 ns/iter (+/- 0)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured

What happened? The Rc was apparently compiled out entirely, as it should be, because we haven't cloned it. So, changing the respective function to:

fn test_with_rc() {
    let rc = Rc::new(1.0);
    test::black_box(*rc.clone() * 2.0);
}

we get the following:

running 2 tests
test bench_box ... bench:          23 ns/iter (+/- 1)
test bench_rc  ... bench:          25 ns/iter (+/- 1)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured

So, suffice it to say, you'll probably need to worry about other things before looking at Rc-induced overhead.
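
If you'd rather stay on stable Rust, a roughly equivalent benchmark can be written with the criterion crate. This is only a sketch, assuming criterion is added as a dev-dependency and wired up as a bench target with harness = false; the function names here are made up for the example:

use criterion::{black_box, criterion_group, criterion_main, Criterion};
use std::rc::Rc;

fn bench_box_vs_rc(c: &mut Criterion) {
    // Same workload as above: allocate, optionally clone, dereference, multiply.
    c.bench_function("box", |b| {
        b.iter(|| {
            let bx = Box::new(1.0);
            black_box(*bx * 2.0)
        })
    });
    c.bench_function("rc_clone", |b| {
        b.iter(|| {
            let rc = Rc::new(1.0);
            black_box(*rc.clone() * 2.0)
        })
    });
}

criterion_group!(benches, bench_box_vs_rc);
criterion_main!(benches);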

Upvotes: 14
