Dominic

Reputation: 200

How can I free all structures that Rust allocated on top of an FFI buffer without freeing the buffer itself?

I have a Java program that calls into Rust via JNA, passing the Rust side a pointer to a potentially large, heap-allocated buffer of contiguously laid out, NUL-terminated UTF-8 strings. The memory is owned by the Java side and is freed when the garbage collector finalizes the associated object.

My goal is to process that buffer by interpreting it as a vector of strings, do what I need to do, and then drop all structures that Rust allocated on top of the buffer (Vecs, Strings, etc.). Because the buffer may be large, I want to avoid copying the data if possible.

Consider the following code:

use std::ffi::CString;
use std::os::raw::c_char;

pub extern "C" fn process_data(data: *const c_char, num_elements: i64) {
    let mut vec: Vec<String> = Vec::with_capacity(num_elements as usize);
    let mut offset = 0;

    unsafe {
        for _ in 0..num_elements {
            let ptr = { data.offset(offset as isize) };

            // Main goal here is to have no memory copy involved
            let s = String::from_utf8_unchecked(CString::from_raw(ptr as *mut c_char).into_bytes());

            offset += s.len() + 1; // Include string termination
            vec.push(s);
        }
    }

    // do stuff with the vector
    // ...

    // Now that we're done, vec would be dropped, freeing the strings, thus freeing their underlying memory.
}

My understanding is that I now have a Vec that internally points to a buffer containing Strings that in turn internally point to Vecs, that then point in some way into the buffer I passed in.

If I let the code run like this without forgetting the vector explicitly, I get a double free because Java tries to deallocate the buffer, but Rust already did so by dropping the vector. Makes sense. However, forgetting the vector leaks all "management" structures on top of the buffer.

I thought about how I could deallocate everything that Rust allocated without leaking any memory. One idea was to explicitly leak boxes and discard the raw pointers they give me (because Java still has a pointer), along the lines of:

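fn forget_vec(vec: Vec<String>) {
    // for_each rather than map: map is lazy and would never run the closure.
    vec.into_iter().for_each(|s| {
        // Turn each String into a raw pointer so Rust never frees its bytes.
        let _ = Box::into_raw(s.into_bytes().into_boxed_slice());
    });
}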

However, a boxed slice is itself a structure that contains a length and a pointer, and I think the code above would leak that structure. I was looking for something that consumes the slice and returns only a pointer, like *const u8.

I have a feeling that I'm generally going in the right direction, but I'm either missing something major or understand too little of Rust to make it work outright.

Upvotes: 3

Views: 1377

Answers (1)

Shepmaster

Reputation: 430574

Reread the documentation for CString, paying attention to the word "owned":

A type representing an owned, C-compatible, nul-terminated string with no nul bytes in the middle.

This type serves the purpose of being able to safely generate a C-compatible string from a Rust byte slice or vector.

You do not own these strings, Java does. Use &str and CStr instead:

use std::ffi::CStr;
use std::os::raw::c_char;

pub extern "C" fn process_data(data: *const c_char, num_elements: i64) {
    let mut vec: Vec<&str> = Vec::with_capacity(num_elements as usize);

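    unsafe {
        let mut ptr = data;

        for _ in 0..num_elements {
            // Borrow the bytes up to the next NUL; nothing is copied or owned.
            let s = CStr::from_ptr(ptr);
            ptr = ptr.add(s.to_bytes().len() + 1); // Include string termination

            // Validate UTF-8; entries that are not valid UTF-8 are skipped.
            if let Ok(s) = s.to_str() {
                vec.push(s);
            }
        }
    }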
}

When your Vec is dropped, it just drops references and nothing is deallocated except the Vec itself.
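
To check that nothing is copied, here is a minimal sketch of the same loop pulled out into a helper that can be exercised from Rust alone. The name borrow_strings and its signature are hypothetical, not part of any library, and the sketch assumes the caller guarantees that the buffer really holds num_elements NUL-terminated strings and outlives the returned references:

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical helper: borrows `num_elements` NUL-terminated strings that
/// are laid out back to back starting at `data`. Nothing is copied.
///
/// # Safety
/// `data` must point to at least `num_elements` consecutive NUL-terminated
/// strings, and that memory must stay alive and unmodified for `'a`.
pub unsafe fn borrow_strings<'a>(data: *const c_char, num_elements: usize) -> Vec<&'a str> {
    let mut out = Vec::with_capacity(num_elements);
    let mut ptr = data;

    for _ in 0..num_elements {
        // CStr::from_ptr only borrows; it never takes ownership of the bytes.
        let s = unsafe { CStr::from_ptr(ptr) };
        // Step past this string and its NUL terminator.
        ptr = unsafe { ptr.add(s.to_bytes().len() + 1) };

        // Entries that are not valid UTF-8 are skipped, as in the loop above.
        if let Ok(s) = s.to_str() {
            out.push(s);
        }
    }

    out
}
```

Each &str returned here points directly into the caller's buffer, so dropping the Vec frees only the Vec's own storage (the pointer/length pairs) and leaves the buffer for its owner to free.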

Upvotes: 7
