Reputation: 2404
I have code similar to the following:
let x = Arc::new(Mutex::new(Thing::new()));
work_on_data(x.clone());
do_more_work_on_data(x.clone());
x isn't used after the second function, and therefore the second clone is not required. Should I remove the clone() manually, or is it optimised out?
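For reference, removing it manually would just mean passing x by value on the last call. A minimal sketch with stub definitions (Thing, work_on_data and do_more_work_on_data are placeholders here; this assumes the functions take the Arc by value):

use std::sync::{Arc, Mutex};

// Stubs standing in for the real type and functions.
struct Thing;

impl Thing {
    fn new() -> Self {
        Thing
    }
}

fn work_on_data(_data: Arc<Mutex<Thing>>) {}
fn do_more_work_on_data(_data: Arc<Mutex<Thing>>) {}

fn main() {
    let x = Arc::new(Mutex::new(Thing::new()));
    work_on_data(x.clone());
    // x is not used after this point, so hand over ownership
    // instead of bumping the reference count with another clone.
    do_more_work_on_data(x);
}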
Upvotes: 3
Views: 120
Reputation: 28902
A Clone implementation might print something out or write to a file. Optimizing such a clone out would change behaviour and therefore cannot be done.
Of course, if the compiler has full knowledge of what clone is doing and can work out that it can never have side-effects, it may well optimize it out, but don't hold your breath.
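For illustration, a minimal sketch (Noisy and consume are made-up names) of a Clone implementation whose removal would be observable:

// A hand-written Clone with a side effect: eliding either clone() call
// below would change what the program prints.
struct Noisy;

impl Clone for Noisy {
    fn clone(&self) -> Self {
        println!("cloning");
        Noisy
    }
}

fn consume(_: Noisy) {}

fn main() {
    let n = Noisy;
    consume(n.clone()); // prints "cloning"
    consume(n.clone()); // prints "cloning" again
}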
As a general rule, if you know a function will always return the same value, cache the value in a variable rather than calling the function a number of times.
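And a trivial sketch of that rule (expensive_lookup is a made-up stand-in for the repeated call):

fn expensive_lookup() -> u64 {
    // Imagine this does real work but always returns the same value.
    42
}

fn main() {
    // Call it once and reuse the cached result instead of calling it twice.
    let value = expensive_lookup();
    println!("{} {}", value, value);
}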
Upvotes: 4
Reputation: 300099
Why not?
The over-arching principle of an optimizing compiler is the as-if rule, which specifies that anything can be optimized as long as the compiler can prove that the optimization is not observable.
Note: this comes on top of some languages allowing specific optimizations.
So for example:
#[derive(Clone, Debug)]
struct MyDummyType(u64);

extern {
    fn print_c(_: *const ());
}

#[inline(never)]
fn print(dummy: MyDummyType) {
    unsafe { print_c(&dummy as *const _ as *const _) }
}

fn main() {
    let x = MyDummyType(42);
    print(x.clone());
    print(x.clone());
}
Yields the following main:
; Function Attrs: nounwind uwtable
define internal void @_ZN8rust_out4main17h0c6f2596c7f28a79E() unnamed_addr #1 {
entry-block:
  tail call fastcc void @_ZN8rust_out5print17h1f2d1a86beea10d7E(i64 42)
  tail call fastcc void @_ZN8rust_out5print17h1f2d1a86beea10d7E(i64 42)
  ret void
}
The compiler completely saw through our code (and I actually had to use an extern function to force it to emit some code in main).
So, what about your case?
It's quite a bit more difficult, to be honest.
Specifically, there's a potential change of semantics due to Drop:

- With do_more_work_on_data(x.clone()), x is guaranteed to be dropped after the call has finished, and therefore any side-effect of Drop is guaranteed to be executed at the end of the current function.
- With do_more_work_on_data(x), x may be dropped at the end of do_more_work_on_data OR it may be dropped earlier, somewhere within it.
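To make the difference concrete, here is a minimal sketch (Loud, work, with_clone and by_value are made-up names, not from the question) in which the Drop side-effect is a println!, so the two call styles produce observably different output:

#[derive(Clone)]
struct Loud;

impl Drop for Loud {
    fn drop(&mut self) {
        // The observable side effect the optimizer must preserve.
        println!("Loud dropped");
    }
}

fn work(_v: Loud) {
    println!("inside work");
    // _v is dropped here, at the end of work (or possibly earlier within it).
}

fn with_clone() {
    let x = Loud;
    work(x.clone()); // only the clone is dropped inside work
    println!("still in with_clone");
    // x itself is dropped here, at the end of the current function,
    // so its "Loud dropped" prints after "still in with_clone".
}

fn by_value() {
    let x = Loud;
    work(x); // x is dropped inside work
    println!("still in by_value"); // its "Loud dropped" has already printed
}

fn main() {
    with_clone();
    by_value();
}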
So in order to prove that the optimization is not observable, the compiler has to prove:

- either that Drop has no effect,
- or that Drop will be executed at the very end of do_more_work_on_data, which is the same as right after it.

How likely is this?
The Drop implementation of Mutex requires invoking FFI, so from the optimizer's point of view it has observable effects.

So it all hinges on whether do_more_work_on_data gets inlined. If it does, then yes, the extra clone could well be optimized out. If it does not, I would not hold my breath.
Upvotes: 4