Lily Mara

Reputation: 4138

Reuse vector heap space for owned Cows without transmute?

I am trying to optimize a Rust program by reducing allocations. I have many borrowed string slices that sometimes need to be turned into owned strings. One function takes two Vecs of Cow<str> values, one with the 'static lifetime (V1) and one with an anonymous lifetime '_ (V2). I need to clear V2, then move the 'static values from V1 into V2 and return the result as a Vec<Cow<'static, str>>. The problem is that the compiler still treats the element lifetime of V2 as '_, even though at runtime it will only hold 'static values. I have made it work for now by using transmute, but transmuting to the 'static lifetime is a major code smell IMO. Here is a minimal example of what I'm trying to accomplish:

use std::borrow::Cow;

fn drain(owned: Vec<Cow<'static, str>>, mut borrowed: Vec<Cow<'_, str>>) -> Vec<Cow<'static, str>> {
    borrowed.clear();
    let mut out = unsafe {
        fn assert_static<T: 'static>(_x: &T) {}
        assert_static(&owned);

        std::mem::transmute::<Vec<Cow<'_, str>>, Vec<Cow<'static, str>>>(borrowed)
    };

    for v in owned {
        out.push(v);
    }

    out
}

Is it possible to write this function without any additional allocations and without using transmute?

Here are some alternatives that I've considered:

Upvotes: 3

Views: 241

Answers (3)

Lily Mara

Reputation: 4138

I have decided to continue using transmute here, but I did isolate it to a single function which can be used in different contexts:

use std::borrow::Cow;

fn clear_and_reinterpret<'a>(mut data: Vec<Cow<'_, str>>) -> Vec<Cow<'a, str>> {
    data.clear();

    unsafe { std::mem::transmute::<Vec<Cow<'_, str>>, Vec<Cow<'a, str>>>(data) }
}

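I know that the turbofish on transmute is technically not required, but it is best practice to include it given how dangerous transmute is when used incorrectly. Spelling out both types means the compiler rejects the call if either side ever changes to something other than what was intended.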
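For illustration, here is a minimal sketch of how the helper might be called. The `refill` function is a hypothetical caller that is not part of the original code, and the helper is repeated only so that the snippet compiles on its own.

```rust
use std::borrow::Cow;

// Same helper as above, repeated so this sketch is self-contained.
fn clear_and_reinterpret<'a>(mut data: Vec<Cow<'_, str>>) -> Vec<Cow<'a, str>> {
    data.clear();
    // SAFETY: the vector is empty, and the two types differ only in lifetime.
    unsafe { std::mem::transmute::<Vec<Cow<'_, str>>, Vec<Cow<'a, str>>>(data) }
}

// Hypothetical caller: reuses the heap buffer of `scratch` for 'static values.
fn refill(
    owned: Vec<Cow<'static, str>>,
    scratch: Vec<Cow<'_, str>>,
) -> Vec<Cow<'static, str>> {
    let mut out = clear_and_reinterpret(scratch);
    // `extend` moves each Cow, so owned strings are not cloned.
    out.extend(owned);
    out
}
```

As long as `scratch` already has enough capacity for the incoming values, the returned vector should keep the same buffer, so no new allocation is needed.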

Upvotes: 1

Coder-256

Reputation: 5618

Try using Vec::into_raw_parts():

#![feature(vec_into_raw_parts)]

use std::borrow::Cow;

fn drain(owned: Vec<Cow<'static, str>>, mut borrowed: Vec<Cow<'_, str>>) -> Vec<Cow<'static, str>> {
    borrowed.clear();
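    let (ptr, len, cap) = borrowed.into_raw_parts();
    // Only the lifetime parameter changes, so size and alignment are identical.
    let ptr = ptr as *mut u8 as *mut Cow<'static, str>;
    // SAFETY: ptr, len (0 after clear) and cap come from a Vec whose element type has the same layout.
    let mut borrowed: Vec<Cow<'static, str>> = unsafe { Vec::from_raw_parts(ptr, len, cap) };
    // Move the elements in; extend_from_slice would clone each Cow and allocate for owned ones.
    borrowed.extend(owned);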
    borrowed
}

(I'm not sure why the intermediate cast to *mut u8 seems to be necessary; omitting it leads to what looks like a spurious error, E0621.)


Edit: Vec::into_raw_parts() is unstable, but its source code is simply:

pub fn into_raw_parts(self) -> (*mut T, usize, usize) {
    let mut me = ManuallyDrop::new(self);
    (me.as_mut_ptr(), me.len(), me.capacity())
}

Therefore, on stable, you can do:

use std::borrow::Cow;
use std::mem::ManuallyDrop;

fn drain(owned: Vec<Cow<'static, str>>, mut borrowed: Vec<Cow<'_, str>>) -> Vec<Cow<'static, str>> {
    borrowed.clear();
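    // ManuallyDrop keeps the old Vec from freeing the buffer we are about to reuse.
    let mut borrowed = ManuallyDrop::new(borrowed);
    let (ptr, len, cap) = (borrowed.as_mut_ptr(), borrowed.len(), borrowed.capacity());
    let ptr = ptr as *mut u8 as *mut Cow<'static, str>;
    // SAFETY: ptr, len (0 after clear) and cap come from a Vec whose element type has the same layout.
    let mut borrowed: Vec<Cow<'static, str>> = unsafe { Vec::from_raw_parts(ptr, len, cap) };
    // Move the elements in; extend_from_slice would clone each Cow and allocate for owned ones.
    borrowed.extend(owned);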
    borrowed
}

Upvotes: 2

Sven Marnach

Reputation: 602115

I think the main question is "Do you really need this optimization?"

The difference will only matter if the function is called many times in a tight loop. If you don't have evidence that it matters, I suggest simply going with owned.clone() and being done with it. If you do have evidence that this is a performance bottleneck, using transmute() is fine and the right solution. The two types you are transmuting between are identical except for the lifetime, so there can't be any difference in memory layout, and it's easy to see that the new lifetime is fine: at the time of the transmutation the vector is empty anyway, and the new elements are added in safe code again.

As a side note, you don't really need assert_static(). The loop at the end of the function will fail to compile if the item type of owned isn't 'static.
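A minimal sketch of the question's function with assert_static() dropped might look like the following. Using `extend` in place of the explicit push loop is my own substitution; it moves the elements in the same way.

```rust
use std::borrow::Cow;

fn drain(
    owned: Vec<Cow<'static, str>>,
    mut borrowed: Vec<Cow<'_, str>>,
) -> Vec<Cow<'static, str>> {
    borrowed.clear();
    // SAFETY: `borrowed` is empty, and the source and target types differ
    // only in their lifetime parameter, so the layout is the same.
    let mut out = unsafe {
        std::mem::transmute::<Vec<Cow<'_, str>>, Vec<Cow<'static, str>>>(borrowed)
    };
    // `out` is a Vec<Cow<'static, str>>, so the compiler itself rejects any
    // attempt to add items that are not 'static. No extra assertion is needed.
    out.extend(owned);
    out
}
```

Everything after the transmute is safe code, which keeps the unsafe part small enough to audit at a glance.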

Upvotes: 3
