Reputation: 189
I work on a Rust library that is used, through C headers, in a Swift UI.
I can read in Rust what Swift sends, but I can't immediately send back to Swift (from Rust) what I've just read.
--
Basically, I can successfully convert a *const i8 containing hello world into a String. But as_ptr() on that same String does not behave consistently (and so its result cannot be parsed as UTF-8 in Swift):
- Swift sends hello world as a *const i8
- Rust handles it through let input: &str successfully (#1 print in get_message()) and rightly prints hello world
- Converting input (a &str) back to a pointer is where it fails: Swift cannot decode the result, although it can decode "hello world".as_ptr()
So why does "hello world".as_ptr() always give the same output that Swift can decode, when input.as_ptr() gives a different output on every call that Swift can never decode (even though printing input rightly returns hello world)?
Do you have any ideas?
use std::ffi::CStr;

#[derive(Debug)]
#[repr(C)]
pub struct MessageC {
    pub message_bytes: *const u8,
    pub message_len: libc::size_t,
}

/// # Safety
/// call of c_string_safe from Swift
/// => https://doc.rust-lang.org/std/ffi/struct.CStr.html#method.from_ptr
unsafe fn c_string_safe(cstring: *const i8) -> String {
    CStr::from_ptr(cstring).to_string_lossy().into_owned()
}

/// # Safety
/// call of c_string_safe from Swift
/// => https://doc.rust-lang.org/std/ffi/struct.CStr.html#method.from_ptr
/// on `async extern "C"` => <https://stackoverflow.com/a/52521592/7281870>
#[no_mangle]
#[tokio::main] // allow async function, needed to call here other async functions (not this example but needed)
pub async unsafe extern "C" fn get_message(
    user_input: *const i8,
) -> MessageC {
    let input: &str = &c_string_safe(user_input);
    println!("from Swift: {}", input); // [consistent] from Swift: hello world
    println!("converted to ptr: {:?}", input.as_ptr()); // [inconsistent] converted to ptr: 0x60000079d770 / converted to ptr: 0x6000007b40b0
    println!("directly to ptr: {:?}", "hello world".as_ptr()); // [consistent] directly to ptr: 0x1028aaf6f
    MessageC {
        message_bytes: input.as_ptr(),
        message_len: input.len() as libc::size_t,
    }
}
Upvotes: 0
Views: 866
Reputation: 154946
The way you construct MessageC is unsound and returns a dangling pointer. The code in get_message() is equivalent to this:
pub async unsafe extern "C" fn get_message(user_input: *const i8) -> MessageC {
    let _invisible = c_string_safe(user_input);
    let input: &str = &_invisible;
    // let's skip the prints
    let msg = MessageC {
        message_bytes: input.as_ptr(),
        message_len: input.len() as libc::size_t,
    };
    drop(_invisible);
    return msg;
}
Hopefully this formulation highlights the issue: c_string_safe() returns an owned, heap-allocated String, which gets dropped (and its data deallocated) by the end of the function. input is a slice that refers to the data allocated by that String. In safe Rust you wouldn't be allowed to return a slice such as input that refers to a local variable; you'd have to either return the String itself or limit yourself to passing the slice downwards to functions.
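As a rough illustration of those two safe options, here is a minimal sketch. The function names owned_greeting and with_greeting are made up for this example and do not appear in the question's code.

```rust
/// Option 1: return the owned String, so the caller becomes responsible
/// for the heap data and it stays alive as long as the caller keeps it.
fn owned_greeting(name: &str) -> String {
    format!("hello {name}")
}

/// Option 2: keep the String local and only lend a slice "downwards"
/// to a callback; the borrow ends before the String is dropped.
fn with_greeting<R>(name: &str, f: impl FnOnce(&str) -> R) -> R {
    let greeting = format!("hello {name}");
    f(&greeting)
}
```

In both cases the borrow checker can verify that nothing outlives the data it points to, which is exactly the guarantee that is lost once a raw pointer is extracted.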
However, you're not using safe Rust and you're creating a pointer to the heap-allocated data. Now you have a problem because as soon as get_message() returns, the _invisible String gets deallocated, and the pointer you're returning is dangling. The dangling pointer may even appear to work because deallocation is not obligated to clear the data from memory, it just marks it as available for future allocations. But those future allocations can and will happen, perhaps from a different thread. Thus a program that references freed memory is bound to misbehave, often in an unpredictable fashion, which is precisely what you have observed.
In all-Rust code you'd resolve the issue by safely returning String instead. But you're doing FFI, so you must reduce the string to a pointer/length pair. Rust allows you to do just that, the easiest way being to just call std::mem::forget() to prevent the string from getting deallocated:
pub async unsafe extern "C" fn get_message(user_input: *const i8) -> MessageC {
    let mut input = c_string_safe(user_input);
    input.shrink_to_fit(); // ensure string capacity == len
    let msg = MessageC {
        message_bytes: input.as_ptr(),
        message_len: input.len() as libc::size_t,
    };
    std::mem::forget(input); // prevent input's data from being deallocated on return
    msg
}
But now you have a different problem: get_message() allocates a string, but how do you deallocate it? Just dropping MessageC won't do it because it only contains a pointer and a length. (And doing so by implementing Drop would probably be unwise because you're sending it to Swift or whatever.) The solution is to provide a separate function that re-creates the String from the MessageC and drops it immediately:
pub unsafe fn free_message_c(m: MessageC) {
    // The call to `shrink_to_fit()` above makes it sound to re-assemble
    // the string using a capacity equal to its length
    drop(String::from_raw_parts(
        m.message_bytes as *mut _,
        m.message_len,
        m.message_len,
    ));
}
You should call this function once you're done with MessageC, i.e. when the Swift code has done its job. (You could even make it extern "C" and call it from Swift.)
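A possible variation, offered as a sketch rather than a replacement for the code above: the documentation for Vec::shrink_to_fit notes that some excess capacity may remain, so if you prefer not to depend on capacity == len you could convert the String into a boxed byte slice, whose allocation is exactly its length. The names message_into_raw and message_free are hypothetical, and usize stands in for libc::size_t so the snippet has no external dependencies.

```rust
#[repr(C)]
pub struct MessageC {
    pub message_bytes: *const u8,
    pub message_len: usize, // stand-in for libc::size_t (assumption for this sketch)
}

/// Leaks the string's bytes as a pointer/length pair for the foreign caller.
pub fn message_into_raw(s: String) -> MessageC {
    // A boxed slice carries no spare capacity: its allocation is exactly `len` bytes.
    let boxed: Box<[u8]> = s.into_bytes().into_boxed_slice();
    let len = boxed.len();
    // Give up ownership; nothing frees the bytes until message_free is called.
    let ptr = Box::into_raw(boxed) as *mut u8;
    MessageC { message_bytes: ptr, message_len: len }
}

/// # Safety
/// `m` must have been produced by `message_into_raw` and must be freed only once.
pub unsafe fn message_free(m: MessageC) {
    unsafe {
        // Rebuild the fat pointer to the boxed slice, then let Box deallocate it.
        let slice = std::ptr::slice_from_raw_parts_mut(m.message_bytes as *mut u8, m.message_len);
        drop(Box::from_raw(slice));
    }
}
```

The ownership story is the same as with std::mem::forget(): the pointer stays valid after the function returns, and exactly one later call must give the memory back to Rust.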
Finally, using "hello world".as_ptr() directly works because "hello world" is a static &str which is baked into the executable and never gets deallocated. In other words, it doesn't point to a String, it points to some static data that comes with the program.
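The difference between the two kinds of pointers can be observed without touching freed memory. The following is a small sketch with made-up helper names (literal_ptr, live_copies_are_distinct); it only compares addresses and never dereferences them.

```rust
/// Address of the string literal's bytes, which live in the program's static data.
/// `inline(never)` keeps a single copy of this function, so every call is
/// expected to report the same address in practice.
#[inline(never)]
fn literal_ptr() -> *const u8 {
    "hello world".as_ptr()
}

/// Two Strings with the same contents that are alive at the same time
/// occupy separate heap allocations, so their data pointers differ.
fn live_copies_are_distinct() -> bool {
    let a = String::from("hello world");
    let b = String::from("hello world");
    a.as_ptr() != b.as_ptr()
}
```

This matches the output in the question: the literal's address stays the same across calls, while each String gets whatever heap address the allocator hands out at that moment.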
Upvotes: 2