Initializing a struct field-by-field. Is it possible to know if all the fields were initialized?

Question

I'm following the example from the official documentation. I'll copy the code here for simplicity:

#[derive(Debug, PartialEq)]
pub struct Foo {
    name: String,
    list: Vec<u8>,
}

let foo = {
    let mut uninit: MaybeUninit<Foo> = MaybeUninit::uninit();
    let ptr = uninit.as_mut_ptr();

    // Initializing the `name` field
    // Using `write` instead of assignment via `=` to not call `drop` on the
    // old, uninitialized value.
    unsafe { addr_of_mut!((*ptr).name).write("Bob".to_string()); }

    // Initializing the `list` field
    // If there is a panic here, then the `String` in the `name` field leaks.
    unsafe { addr_of_mut!((*ptr).list).write(vec![0, 1, 2]); }

    // All the fields are initialized, so we call `assume_init` to get an initialized Foo.
    unsafe { uninit.assume_init() }
};

What bothers me is the second unsafe comment: If there is a panic here, then the String in the name field leaks. This is exactly what I want to avoid. I modified the example so now it reflects my concerns:

use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

#[derive(Debug, PartialEq)]
pub struct Foo {
    name: String,
    list: Vec<u8>,
}

#[allow(dead_code)]
fn main() {
    let mut uninit: MaybeUninit<Foo> = MaybeUninit::uninit();
    let ptr = uninit.as_mut_ptr();
    
    init_foo(ptr);
    
    // this is wrong because it tries to read the uninitialized field
    // I could avoid this call if the function `init_foo` returned a `Result`,
    // but I'd like to know which fields are initialized so I can clean up
    let _foo = unsafe { uninit.assume_init() };
}

fn init_foo(foo_ptr: *mut Foo) {
    unsafe { addr_of_mut!((*foo_ptr).name).write("Bob".to_string()); }
    
    // something happened and `list` field left uninitialized
    return;
}

The code builds and runs, but Miri reports this error:

Undefined Behavior: type validation failed at .value.list.buf.ptr.pointer: encountered uninitialized raw pointer

The question is how I can figure out which fields are initialized and which are not? Sure, I could return a result with the list of field names or similar, for example. But I don't want to do it - my struct can have dozens of fields, it changes over time and I'm too lazy to maintain an enum that should reflect the fields. Ideally I'd like to have something like this:

if addr_initialized!((*ptr).name) {
    clean(addr_of_mut!((*ptr).name));
}

Update: Here's an example of what I want to achieve. I'm doing some Vulkan programming (with ash crate, but that's not important). I want to create a struct that holds all the necessary objects, like Device, Instance, Surface, etc.:

struct VulkanData {
    pub instance: Instance,
    pub device: Device,
    pub surface: Surface,
    // 100500 other fields
}

fn init() -> Result {
    // let vulkan_data = VulkanData{}; // can't do that because some fields are not default constructible.

    let instance = create_instance(); // can fail
    let device = create_device(instance); // can fail, in which case the instance has to be destroyed
    let surface = create_surface(device); // can fail, in which case the instance and device have to be destroyed

    //other initialization routines

    VulkanData{instance, device, surface, ...}
}

As you can see, every such object has a corresponding create_x function, which can fail. Obviously, if one fails in the middle of the process, I don't want to proceed, but I do want to clean up the objects already created. As you mentioned, I could create a wrapper. But creating wrappers for hundreds of types is very tedious work that I absolutely want to avoid (btw, ash is already a wrapper over C types). Moreover, because of the asynchronous nature of CPU-GPU communication, it sometimes makes no sense to drop an object immediately; doing so can lead to errors. Instead, some form of signal should come from the GPU indicating that an object is safe to destroy. That's the main reason why I can't implement Drop for the wrappers.

But as soon as the struct is successfully initialized, I know that it's safe to read any of its fields. That's why I don't want to use an Option: it adds some overhead and makes no sense in my particular example.

All that is trivially achievable in C++: create an uninitialized struct (well, by default all Vulkan objects are initialized with VK_NULL_HANDLE), start to fill it field by field, and if something goes wrong just destroy the objects that are not null.

kmdreko · Accepted Answer

There is no general purpose way to tell if something is initialized or not. Miri can detect this because it adds a lot of instrumentation and overhead to track memory operations.
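Since nothing records initialization state for you, any tracking has to be explicit. As a minimal sketch (not something from the question or from ash), one option is a staging struct whose fields are `Option`s only while building, converted into the plain `Foo` at the end. `PartialFoo` and `finish` are hypothetical names introduced for illustration:

```rust
#[derive(Debug, PartialEq)]
pub struct Foo {
    name: String,
    list: Vec<u8>,
}

/// Hypothetical staging struct: each field is an `Option` only while building.
#[derive(Debug, Default)]
struct PartialFoo {
    name: Option<String>,
    list: Option<Vec<u8>>,
}

impl PartialFoo {
    /// Returns the finished `Foo` if every field was set; otherwise hands the
    /// partial state back so the caller can see which fields need cleanup.
    fn finish(self) -> Result<Foo, PartialFoo> {
        match self {
            PartialFoo { name: Some(name), list: Some(list) } => Ok(Foo { name, list }),
            partial => Err(partial),
        }
    }
}

/// Mirrors the question's `init_foo`: sets `name`, then bails out early.
fn init_foo(partial: &mut PartialFoo) {
    partial.name = Some("Bob".to_string());
    // something happened and `list` was left unset
}
```

The finished `Foo` carries no `Option`s, so the bookkeeping exists only during construction. The staging struct still has to mirror the real one field by field, which is the maintenance cost the question wants to avoid.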

"All that is trivially achievable in C++ - create an uninitialized struct (well, by default all Vulkan objects are initialized with VK_NULL_HANDLE), start to fill it field-by-field, if something went wrong just destroy the objects that are not null."

You could theoretically do the same in Rust, however this is quite unsafe and makes a lot of assumptions about the construction of the ash types.

If the functions didn't depend on each other, I might suggest something like this:

let instance = create_instance();
let device = create_device();
let surface = create_surface();

match (instance, device, surface) {
    (Ok(instance), Ok(device), Ok(surface)) => {
        Ok(VulkanData{
            instance, 
            device, 
            surface,
        })
    }
    (instance, device, surface) => {
        // clean up the `Ok` ones and return some error
    }
}

However, your functions are dependent on others succeeding (e.g. need the Instance to create a Device) and this also has the disadvantage that it would keep creating values when one already failed.

Creating wrappers with custom drop behavior is the most robust way to accomplish this. There is the vulkano crate that is built on top of ash that does this among other things. But if that's not to your liking you can use something like scopeguard to encapsulate drop logic on the fly.
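To illustrate why wrappers with `Drop` make the early-return case work without any tracking, here is a rough sketch using stand-in types. `Instance`, `Device`, `Surface`, the `create_*` functions, and the `Log` used to observe destruction are all hypothetical placeholders, not the ash or vulkano API:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared log so the order of destruction can be observed (illustration only).
type Log = Rc<RefCell<Vec<&'static str>>>;

// Stand-in wrappers; a real wrapper would call the matching destroy function.
struct Instance { log: Log }
impl Drop for Instance {
    fn drop(&mut self) { self.log.borrow_mut().push("destroy instance"); }
}

struct Device { log: Log }
impl Drop for Device {
    fn drop(&mut self) { self.log.borrow_mut().push("destroy device"); }
}

struct Surface;

struct VulkanData {
    instance: Instance,
    device: Device,
    surface: Surface,
}

fn create_instance(log: &Log) -> Result<Instance, String> {
    Ok(Instance { log: Rc::clone(log) })
}

fn create_device(_instance: &Instance, log: &Log) -> Result<Device, String> {
    Ok(Device { log: Rc::clone(log) })
}

fn create_surface(_device: &Device, fail: bool) -> Result<Surface, String> {
    if fail { Err("surface creation failed".to_string()) } else { Ok(Surface) }
}

fn init(log: &Log, fail_surface: bool) -> Result<VulkanData, String> {
    let instance = create_instance(log)?;
    let device = create_device(&instance, log)?;
    // If this `?` returns early, `device` and then `instance` are dropped,
    // in reverse order of declaration.
    let surface = create_surface(&device, fail_surface)?;
    Ok(VulkanData { instance, device, surface })
}
```

On success nothing is destroyed until `VulkanData` itself is dropped. Note that struct fields are dropped in declaration order, so with this field order the instance would be destroyed before the device; a real wrapper would need to order the fields (or implement `Drop` on `VulkanData`) accordingly.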

use scopeguard::{guard, ScopeGuard}; // 1.1.0

fn init() -> Result {
    let instance = guard(create_instance()?, destroy_instance);
    let device = guard(create_device(&instance)?, destroy_device);
    let surface = guard(create_surface(&device)?, destroy_surface);

    Ok(VulkanData {
        // use `into_inner` to escape the drop behavior
        instance: ScopeGuard::into_inner(instance), 
        device: ScopeGuard::into_inner(device),
        surface: ScopeGuard::into_inner(surface),
    })
}

See a full example on the playground. No unsafe required.
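For readers curious about the mechanism, the following is a hand-rolled, std-only sketch of the same idea. It is not scopeguard's actual implementation; `Guard`, the stand-in `Instance` and `Device` types, `create_device`, and the `Log` are hypothetical names used only to make the behavior observable:

```rust
use std::cell::RefCell;
use std::ops::Deref;

/// Runs `cleanup(value)` on drop unless the value is taken out with `into_inner`.
struct Guard<T, F: FnOnce(T)> {
    inner: Option<(T, F)>,
}

impl<T, F: FnOnce(T)> Guard<T, F> {
    fn new(value: T, cleanup: F) -> Self {
        Guard { inner: Some((value, cleanup)) }
    }

    /// Disarms the guard and returns the value without running the cleanup.
    fn into_inner(mut self) -> T {
        let (value, _cleanup) = self.inner.take().expect("value present until consumed");
        value
    }
}

impl<T, F: FnOnce(T)> Deref for Guard<T, F> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.inner.as_ref().expect("value present until consumed").0
    }
}

impl<T, F: FnOnce(T)> Drop for Guard<T, F> {
    fn drop(&mut self) {
        // `inner` is `None` after `into_inner`, so the cleanup is skipped then.
        if let Some((value, cleanup)) = self.inner.take() {
            cleanup(value);
        }
    }
}

// Stand-in handle types for illustration.
#[derive(Debug, PartialEq)]
struct Instance(u32);
#[derive(Debug, PartialEq)]
struct Device(u32);

type Log = RefCell<Vec<String>>;

fn create_device(_instance: &Instance, fail: bool) -> Result<Device, String> {
    if fail { Err("device creation failed".to_string()) } else { Ok(Device(2)) }
}

fn init(log: &Log, fail_device: bool) -> Result<(Instance, Device), String> {
    let instance = Guard::new(Instance(1), |i: Instance| {
        log.borrow_mut().push(format!("destroy {:?}", i));
    });
    // On failure, `?` returns early and the guard's cleanup runs.
    let device = create_device(&instance, fail_device)?;
    Ok((instance.into_inner(), device))
}
```

Because the cleanup is a closure supplied at the call site, no per-type wrapper or `Drop` impl is needed, and the finished struct holds the bare values.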
