Danila Kiver

Reputation: 3758

Why do zero-sized types cause real allocations in some cases?

I was playing with zero-sized types (ZSTs) because I was curious about how they are actually implemented under the hood. Given that ZSTs do not require any space in memory and that taking a raw pointer is a safe operation, I was interested in what raw pointers I would get from different kinds of ZST "allocations" and how weird (for safe Rust) the results would be.

My first attempt (test_stk.rs) was to take const pointers to a few on-stack instances of ZSTs:

struct Empty;
struct EmptyAgain;

fn main() {
    let stk_ptr: *const Empty = &Empty;
    let stk_ptr_again: *const EmptyAgain = &EmptyAgain;
    let nested_stk_ptr = nested_stk();

    println!("Pointer to on-stack Empty:        {:?}", stk_ptr);
    println!("Pointer to on-stack EmptyAgain:   {:?}", stk_ptr_again);
    println!("Pointer to Empty in nested frame: {:?}", nested_stk_ptr);
}

fn nested_stk() -> *const Empty {
    &Empty
}

Compiling and running this produced the following result:

$ rustc test_stk.rs -o test_stk
$ ./test_stk 
Pointer to on-stack Empty:        0x55ab86fc6000
Pointer to on-stack EmptyAgain:   0x55ab86fc6000
Pointer to Empty in nested frame: 0x55ab86fc6000

A short analysis of the process memory map showed that 0x55ab86fc6000 was actually not a stack allocation, but the very beginning of the .rodata section. This seems logical: the compiler pretends that there is a single zero-sized value for each ZST, known at compile time, and each of these values resides in .rodata, as compile-time constants do.

The second attempt was with boxed ZSTs (test_box.rs):

struct Empty;
struct EmptyAgain;

fn main() {
    let ptr = Box::into_raw(Box::new(Empty));
    let ptr_again = Box::into_raw(Box::new(EmptyAgain));
    let nested_ptr = nested_box();

    println!("Pointer to boxed Empty:                 {:?}", ptr);
    println!("Pointer to boxed EmptyAgain:            {:?}", ptr_again);
    println!("Pointer to boxed Empty in nested frame: {:?}", nested_ptr);
}

fn nested_box() -> *mut Empty {
    Box::into_raw(Box::new(Empty))
}

Running this snippet gave:

$ rustc test_box.rs -o test_box
$ ./test_box 
Pointer to boxed Empty:                 0x1
Pointer to boxed EmptyAgain:            0x1
Pointer to boxed Empty in nested frame: 0x1

Quick debugging showed that this is how the allocator works for ZSTs (Rust's liballoc/alloc.rs):

unsafe fn exchange_malloc(size: usize, align: usize) -> *mut u8 {
    if size == 0 {
        align as *mut u8
    } else {
        // ...
    }
}

The minimum possible alignment is 1 (as per the Nomicon), so for ZSTs the box operator calls exchange_malloc(0, 1) and the resulting address is 0x1.
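The same behaviour can be observed without reading the allocator source. The sketch below is only an illustration: the `Aligned16` type and the `boxed_zst_addr` helper are hypothetical names, and the exact addresses printed are an implementation detail of the current standard library. What can be relied on is that the pointer is non-null and suitably aligned for the type.

```rust
use std::mem::align_of;

struct Empty;

// Hypothetical ZST with a larger alignment, used only for illustration.
#[repr(align(16))]
struct Aligned16;

// Boxes a zero-sized value and returns the address stored in the Box.
// Nothing is actually allocated for a ZST, so turning the Box into a raw
// pointer and never freeing it leaks no memory.
fn boxed_zst_addr<T>(value: T) -> usize {
    Box::into_raw(Box::new(value)) as usize
}

fn main() {
    let a = boxed_zst_addr(Empty);
    let b = boxed_zst_addr(Aligned16);

    // The concrete values are not guaranteed; only "non-null and aligned" is.
    println!("Box<Empty>:     {:#x} (align {})", a, align_of::<Empty>());
    println!("Box<Aligned16>: {:#x} (align {})", b, align_of::<Aligned16>());
}
```

If the address really is derived from the alignment, as in the `exchange_malloc` excerpt, the second line should show a different fake address from the first.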

After noticing that into_raw() returns a mutable pointer, I decided to retry the previous test (on-stack) with mutable pointers (test_stk_mut.rs):

struct Empty;
struct EmptyAgain;

fn main() {
    let stk_ptr: *mut Empty = &mut Empty;
    let stk_ptr_again: *mut EmptyAgain = &mut EmptyAgain;
    let nested_stk_ptr = nested_stk();

    println!("Pointer to on-stack Empty:        {:?}", stk_ptr);
    println!("Pointer to on-stack EmptyAgain:   {:?}", stk_ptr_again);
    println!("Pointer to Empty in nested frame: {:?}", nested_stk_ptr);
}

fn nested_stk() -> *mut Empty {
    &mut Empty
}

And running this printed the following:

$ rustc test_stk_mut.rs -o test_stk_mut
$ ./test_stk_mut 
Pointer to on-stack Empty:        0x7ffc3817b5e0
Pointer to on-stack EmptyAgain:   0x7ffc3817b5f0
Pointer to Empty in nested frame: 0x7ffc3817b580

It turns out that this time I had real stack-allocated values, each having its own address! When I tried to declare them sequentially (test_stk_seq.rs), I discovered that each of these values occupied eight bytes:

struct Empty;

fn main() {
    let mut stk1 = Empty;
    let mut stk2 = Empty;
    let mut stk3 = Empty;
    let mut stk4 = Empty;
    let mut stk5 = Empty;

    let stk_ptr1: *mut Empty = &mut stk1;
    let stk_ptr2: *mut Empty = &mut stk2;
    let stk_ptr3: *mut Empty = &mut stk3;
    let stk_ptr4: *mut Empty = &mut stk4;
    let stk_ptr5: *mut Empty = &mut stk5;

    println!("Pointer to on-stack Empty: {:?}", stk_ptr1);
    println!("Pointer to on-stack Empty: {:?}", stk_ptr2);
    println!("Pointer to on-stack Empty: {:?}", stk_ptr3);
    println!("Pointer to on-stack Empty: {:?}", stk_ptr4);
    println!("Pointer to on-stack Empty: {:?}", stk_ptr5);
}

Run:

$ rustc test_stk_seq.rs -o test_stk_seq
$ ./test_stk_seq 
Pointer to on-stack Empty: 0x7ffdba303840
Pointer to on-stack Empty: 0x7ffdba303848
Pointer to on-stack Empty: 0x7ffdba303850
Pointer to on-stack Empty: 0x7ffdba303858
Pointer to on-stack Empty: 0x7ffdba303860

So, here are the things I cannot understand:

  1. Why do boxed ZST allocations use the dumb 0x1 address instead of something more meaningful, as in the case of "on-stack" values?

  2. Why is there a need to allocate real space for on-stack ZST values when there are mutable raw pointers to them?

  3. Why are exactly eight bytes used for mutable on-stack allocations? Should I treat this size as "0 bytes of actual type size + 8 bytes of alignment"?

Upvotes: 55

Views: 2278

Answers (1)

Chayim Friedman

Reputation: 71330

Important note: Remember that almost none of the following is guaranteed. It is just how it works right now.


Why do boxed ZST allocations use the dumb 0x1 address instead of something more meaningful, as in the case of "on-stack" values?

No address is meaningful for a ZST. The compiler just uses the easiest approach. In particular, neither the stack address you get for mutable pointers nor the .rodata address you get for shared ones is specific to ZSTs; both are general properties of any type, as I will explain in a minute. Box, by contrast, needs to handle ZSTs specially. It does that in the easiest possible way: it returns the first possible fake address, which is the alignment itself.

Why is there a need to allocate real space for on-stack ZST values when there are mutable raw pointers to them?

The question is not why we need to allocate real stack space for ZSTs; the question is why not. Every variable and temporary gets allocated on the stack. There is no reason to special-case ZSTs.

If you object, "but I saw that with shared references they are allocated in .rodata!", try the following:

struct Empty;
struct EmptyAgain;

fn main() {
    let empty = Empty;
    let empty_again = EmptyAgain;
    let stk_ptr: *const Empty = &empty;
    let stk_ptr_again: *const EmptyAgain = &empty_again;
    let nested_stk_ptr = nested_stk();

    println!("Pointer to on-stack Empty:        {:?}", stk_ptr);
    println!("Pointer to on-stack EmptyAgain:   {:?}", stk_ptr_again);
    println!("Pointer to Empty in nested frame: {:?}", nested_stk_ptr);
}

fn nested_stk() -> *const Empty {
    let empty = Empty;
    &empty
}

You can see that they are allocated on the stack.

And if you ask, "but still, when taking the address in the same statement (let stk_ptr = &Empty;), it gives an address in .rodata for a shared reference and on the stack for a mutable one!", the answer is that the mutable case is the normal case, and shared references are special-cased due to static promotion. In the normal case (mutable references, function calls, and other things), the following:

let v1 = &mut Foo;

let v2 = &foo();

is translated into:

let mut __v1_storage = Foo;
let v1 = &mut __v1_storage;

let __v2_storage = foo();
let v2 = &__v2_storage;

With some expressions, in particular struct literals, the translation is different:

let v = &Foo { ... };

// Translated into:

static __V_STORAGE: Foo = Foo { ... };
let v = &__V_STORAGE;

And like any static, it is stored in .rodata, whether it is a ZST or not.
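One way to observe the promotion from safe code is through lifetimes: a reference to a promoted value can be 'static, while a reference to an ordinary temporary cannot. A minimal sketch, where `Foo`, its field `x`, and the `promoted` function are made-up names for illustration:

```rust
struct Foo {
    x: u32,
}

// Compiles because `Foo { x: 1 }` is a constant expression: the value is
// promoted to static storage, so the reference outlives the function call.
fn promoted() -> &'static Foo {
    &Foo { x: 1 }
}

// The mutable counterpart is rejected by the compiler, because the temporary
// lives in this function's stack frame:
//
// fn not_promoted() -> &'static mut Foo {
//     &mut Foo { x: 1 }
// }

fn main() {
    let foo = promoted();
    // The printed address depends on the build; it is not a stack address.
    println!("x = {} at {:p}", foo.x, foo);
}
```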

Why are exactly eight bytes used for mutable on-stack allocations? Should I treat this size as "0 bytes of actual type size + 8 bytes of alignment"?

More like "1 byte of actual size + 7 bytes padding for alignment". But in Rust, the size of ZSTs is (obviously) zero and the (default) alignment is one, so what happens here?

Well, rustc lowers ZSTs into an empty LLVM struct (%Empty = type { }). Structs in LLVM use the maximum of the specified alignment (in the instructions handling them) and the target's preferred alignment. The preferred alignment of x86-64 is 8 bytes, so max(1, 8) = 8.

As for the size, LLVM does not handle zero-sized stack allocations. When an empty struct is alloca'd, LLVM rounds its size up to one. So we have a size of one and an alignment of 8; padding the size to a multiple of the alignment gives 8 bytes for each allocation.
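To separate what Rust reports about the type from what the backend happens to do, the layout can be checked with std::mem. In this sketch (the `stride` helper is a hypothetical name) the `size_of` and `align_of` values describe the type itself, while the distance between two stack slots is whatever the current compiler, optimization level, and target produce, so it is printed rather than assumed.

```rust
use std::mem::{align_of, size_of};

struct Empty;

// Absolute distance in bytes between two addresses.
fn stride(a: *const Empty, b: *const Empty) -> usize {
    (a as usize).abs_diff(b as usize)
}

fn main() {
    // What Rust reports about the type.
    println!("size_of:  {}", size_of::<Empty>());
    println!("align_of: {}", align_of::<Empty>());

    // What the backend happens to do with two mutable locals.
    let mut first = Empty;
    let mut second = Empty;
    let p1: *mut Empty = &mut first;
    let p2: *mut Empty = &mut second;
    println!("distance between stack slots: {} bytes", stride(p1, p2));
}
```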

If you try with e.g. struct Empty(u8); or struct Empty(u8, u8); you will see that they use 1 or 2 bytes of stack space, respectively, and not 8. This is because these structs (Scalar and ScalarPair layouts, as they are called in rustc) are not represented as LLVM structs but as LLVM primitives: i8 and { i8, i8 }. Those do not use the preferred alignment. But if you use three fields, you will see that it is also 8 bytes wide:

struct Empty(u8, u8, u8);

fn main() {
    let mut stk1 = Empty(0, 0, 0);
    let mut stk2 = Empty(0, 0, 0);
    let mut stk3 = Empty(0, 0, 0);
    let mut stk4 = Empty(0, 0, 0);
    let mut stk5 = Empty(0, 0, 0);

    let stk_ptr1: *mut Empty = &mut stk1;
    let stk_ptr2: *mut Empty = &mut stk2;
    let stk_ptr3: *mut Empty = &mut stk3;
    let stk_ptr4: *mut Empty = &mut stk4;
    let stk_ptr5: *mut Empty = &mut stk5;

    println!("Pointer to on-stack Empty: {:?}", stk_ptr1);
    println!("Pointer to on-stack Empty: {:?}", stk_ptr2);
    println!("Pointer to on-stack Empty: {:?}", stk_ptr3);
    println!("Pointer to on-stack Empty: {:?}", stk_ptr4);
    println!("Pointer to on-stack Empty: {:?}", stk_ptr5);
}

Upvotes: 3
