Narann

Reputation: 878

Is an Option type smaller than the wrapped type plus a boolean?

use std::mem::size_of;

struct Position {
    x: f32,
    y: f32,
    z: f32,
}

struct PoolItem {
    entity_id: u32, // 4 bytes
    used: bool, // 1 byte + 3 bytes of padding
    component: Position, // 12 bytes
}

fn main() {
    assert_eq!(size_of::<u32>(), 4);
    assert_eq!(size_of::<Position>(), 12);
    assert_eq!(size_of::<PoolItem>(), 20);
}

As you can see, this structure is 20 bytes long. The Position component is actually optional: it is only meaningful when used is true.

Would using Option remove the need for the used field and shrink the structure to 16 bytes?

struct PoolItem {
    entity_id: u32, // 4 bytes
    component: Option<Position>, // 12 bytes ?
}

If so, how is Option implemented to make that work?

My tests on Playground seem to indicate it doesn't work. Why?

Upvotes: 1

Views: 773

Answers (3)

Narann

Reputation: 878

As suggested in the comments, an alternative is to make entity_id an Option<NonZeroU32> and rely on Some and None to tell whether the entity is used.

use std::mem::size_of;

// Position is defined as in the question.
struct PoolItem {
    entity_id: Option<core::num::NonZeroU32>, // 4 bytes
    component: Position, // 12 bytes
}

fn main() {
    assert_eq!(size_of::<u32>(), 4);
    assert_eq!(size_of::<Position>(), 12);
    assert_eq!(size_of::<PoolItem>(), 16);
}

This makes entity IDs start from 1, because 0 is not a valid NonZeroU32.
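A minimal sketch of how such a slot could be used, assuming a free slot is represented by entity_id being None. The is_used helper and the Default derive on Position are illustrative additions, not part of the original answer.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

#[allow(dead_code)]
#[derive(Default)]
struct Position {
    x: f32,
    y: f32,
    z: f32,
}

#[allow(dead_code)]
struct PoolItem {
    entity_id: Option<NonZeroU32>, // None = slot is free
    component: Position,
}

impl PoolItem {
    /// Hypothetical helper: a slot counts as used when it holds an entity ID.
    fn is_used(&self) -> bool {
        self.entity_id.is_some()
    }
}

fn main() {
    let free = PoolItem { entity_id: None, component: Position::default() };
    let used = PoolItem {
        entity_id: NonZeroU32::new(1), // Some(1)
        component: Position { x: 1.0, y: 2.0, z: 3.0 },
    };

    assert!(!free.is_used());
    assert!(used.is_used());

    // 0 cannot be turned into a NonZeroU32, so it can never be an entity ID.
    assert!(NonZeroU32::new(0).is_none());

    // The Option costs nothing here: None is stored as the forbidden value 0.
    assert_eq!(size_of::<PoolItem>(), 16);
}
```

The trade-off is that the component itself is always present in memory; only the ID tells you whether its contents are meaningful.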

Playground

Upvotes: -2

Freyja

Reputation: 40894

Option<Position> needs to store its state (Some or None) somewhere, and because all 12 bytes of Position already carry information, it needs extra space to do so. Usually this means adding an extra byte (plus padding) for the state, although that can be avoided when the inner type has a known unused state. For example, a reference can never point to address 0, so Option<&'_ T> can use 0 as the None state and take up the same number of bytes as &'_ T. For your Position type, however, that's not the case.
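The difference can be checked with size_of. This is a small sketch that reuses the Position struct from the question; the exact size of Option<Position> is left to the compiler, so the code only asserts that it is larger than Position.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

#[allow(dead_code)]
struct Position {
    x: f32,
    y: f32,
    z: f32,
}

fn main() {
    // References are never null, so None can reuse the all-zero bit pattern.
    assert_eq!(size_of::<Option<&Position>>(), size_of::<&Position>());

    // NonZeroU32 can never be 0, so None is stored as 0.
    assert_eq!(size_of::<Option<NonZeroU32>>(), size_of::<u32>());

    // Every bit pattern of an f32 is a valid value, so Position has no
    // unused state and Option<Position> needs room for a discriminant.
    assert!(size_of::<Option<Position>>() > size_of::<Position>());

    println!("Option<Position>: {} bytes", size_of::<Option<Position>>());
}
```

With current compilers the printed size is typically 16 bytes: 12 bytes of payload plus a discriminant padded to the 4-byte alignment, which is why swapping used: bool for Option<Position> does not shrink PoolItem.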

If you absolutely need your PoolItem struct to be as small as possible, and if you can spare one bit from your entity_id field (say, the highest bit, 2^31), you can use that to store the state instead:

const COMPONENT_USED_BIT: u32 = 1u32 << 31;

struct PoolItem {
    entity_id: u32, // lowest 31 bits = entity ID, highest bit = "component used"
    component: Position,
}

This might become a bit complex, since you need to ensure that you're treating that bit specially, but you can write a couple of simple accessor methods to ensure that the special bit is dealt with correctly.

impl PoolItem {
    /// Get entity ID, without the "component used" bit
    fn entity_id(&self) -> u32 {
        self.entity_id & !COMPONENT_USED_BIT
    }

    /// Set entity ID, keeping the existing "component used" bit
    fn set_entity_id(&mut self, entity_id: u32) {
        let component_used_bit = self.entity_id & COMPONENT_USED_BIT;
        self.entity_id = (entity_id & !COMPONENT_USED_BIT) | component_used_bit;
    }

    /// Get component if "component used" bit is set
    fn component(&self) -> Option<&Position> {
        if self.entity_id & COMPONENT_USED_BIT != 0 {
            Some(&self.component)
        } else {
            None
        }
    }

    /// Set component, updating the "component used" bit
    fn set_component(&mut self, component: Option<Position>) {
        if let Some(component) = component {
            self.component = component;
            self.entity_id |= COMPONENT_USED_BIT;
        } else {
            self.entity_id &= !COMPONENT_USED_BIT;
        }
    }
}
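The bit arithmetic behind these accessors can be checked in isolation. In the sketch below, pack and unpack are hypothetical free functions introduced only for illustration; they mirror the masking done in the methods above.

```rust
/// Highest bit of the packed field marks "component used".
const COMPONENT_USED_BIT: u32 = 1 << 31;

/// Hypothetical helper: combine a 31-bit entity ID with the "used" flag.
fn pack(entity_id: u32, used: bool) -> u32 {
    let id_bits = entity_id & !COMPONENT_USED_BIT; // drop bit 31 of the ID
    if used {
        id_bits | COMPONENT_USED_BIT
    } else {
        id_bits
    }
}

/// Hypothetical helper: split a packed field into (entity ID, used flag).
fn unpack(raw: u32) -> (u32, bool) {
    (raw & !COMPONENT_USED_BIT, raw & COMPONENT_USED_BIT != 0)
}

fn main() {
    let raw = pack(5, true);
    assert_eq!(raw, 0x8000_0005);
    assert_eq!(unpack(raw), (5, true));
    assert_eq!(unpack(pack(5, false)), (5, false));

    // IDs that need bit 31 do not survive: only the low 31 bits are kept.
    assert_eq!(unpack(pack(u32::MAX, false)), (0x7FFF_FFFF, false));
}
```

The last assertion shows the cost of this layout: entity IDs are limited to the range 0 to 2^31 - 1.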

Playground example with tests

Upvotes: 3

ShadowRanger

Reputation: 155536

The precise implementation of Option doesn't really matter. You can't store X bytes of data in X bytes of storage and also record whether the data is there at all. An obvious implementation for Option would be to store both the object and a boolean indicating whether the object exists; something like that is happening here. Option is a convenience, but it still has to store that information somewhere.

Note that outside of a struct (which must have a consistent size), Option might avoid this cost if the optimizer determines that its "populated or not" status is known at all times. The boolean can then be elided, with the code always using the value deterministically in the correct way (either reading the object from the stack if it logically exists, or not doing so when it doesn't). But in this case, the extra data is needed.

Upvotes: 5
