Max Vu

Reputation: 504

Can casts be more performant than tagged unions?

I'm designing a classical virtual machine that will operate on some kind of generic, type-switched value -- now represented with a tagged union:

typedef struct val {
    val_type type;
    union {
        int            i;
        unsigned int   u;
        double         f;
        str *        str;
        vec *        vec;
        map *        map;
    };
} val;

I've found a lot of literature about this online and am concluding that this is a pretty orthodox approach to the problem. I'm wondering, though, whether performance could possibly be improved with an approach like this:

typedef struct val_int {
    val_type type;
    int i;
} val_int;

typedef struct val_str {
    val_type type;
    char * buffer;
    size_t length;
    size_t capacity;
} val_str;

typedef struct val_vec {
    val_type type;
    val_type ** members; // <-- access member by cast
    size_t length;
    size_t capacity;
} val_vec;

Here, I'm reasoning that there is a tradeoff between the cost of an extra indirection for accessing the primitive types (as well as needing to perform individual allocations - perhaps helped by pooling), held against the memory saved in collections like val_vec, by halving the size of the currently-fat pointer it represents. I know the easy answer here will be "measure it" but I'm having trouble coming up with an adequately-representative model that isn't a full implementation itself.
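One rough way to put numbers on the memory side of this tradeoff, short of a full implementation, is to compare per-element sizes directly. The sketch below is only an illustration: the enum constants, the opaque str/vec/map declarations, and the three helper functions are assumptions, not part of the real VM. On many 64-bit ABIs sizeof(val) is 16 and a pointer is 8, but that should be checked on the target platform rather than assumed.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the question's types. */
typedef enum { VAL_INT, VAL_UINT, VAL_FLOAT, VAL_STR, VAL_VEC, VAL_MAP } val_type;
typedef struct str str;
typedef struct vec vec;
typedef struct map map;

/* Approach 1: tagged union stored inline in the collection. */
typedef struct val {
    val_type type;
    union {
        int          i;
        unsigned int u;
        double       f;
        str         *str;
        vec         *vec;
        map         *map;
    };
} val;

/* Approach 2: tag-first struct, reached through a pointer. */
typedef struct val_int {
    val_type type;
    int      i;
} val_int;

/* Bytes used by the element array of a collection holding n values. */
size_t inline_vec_bytes(size_t n) { return n * sizeof(val); }
size_t boxed_vec_bytes(size_t n)  { return n * sizeof(val_type *); }

/* With approach 2, each int also needs its own val_int allocation
   (allocator overhead is not counted here). */
size_t boxed_int_total_bytes(size_t n)
{
    return n * (sizeof(val_type *) + sizeof(val_int));
}
```

Printing inline_vec_bytes(n) next to boxed_int_total_bytes(n) for a realistic n shows that the pointer array is smaller only as long as the separately allocated primitives are not counted.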

Is there a name for this second approach, and, assuming that this will be managed carefully (but not assuming it doesn't cause undefined behavior), is there a widely understood risk that I'm not accounting for? Which approach is preferable here?

Incidentally, could flexible array members be used in a similar way?
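To make the flexible array member idea concrete, here is a minimal sketch of a string value whose bytes share one allocation with the tag, which removes the extra pointer hop to a separate buffer. The names val_str_fam and val_str_fam_new and the enum constants are hypothetical, and this variant cannot grow in place without a realloc of the whole object.

```c
#include <stdlib.h>
#include <string.h>

typedef enum { VAL_INT, VAL_STR } val_type;  /* hypothetical tags */

/* String value whose bytes live in the same allocation as the header. */
typedef struct val_str_fam {
    val_type type;
    size_t   length;
    char     data[];   /* flexible array member (C99) */
} val_str_fam;

/* Allocate header and payload in one block; returns NULL on failure. */
val_str_fam *val_str_fam_new(const char *src)
{
    size_t len = strlen(src);
    val_str_fam *s = malloc(sizeof *s + len + 1);  /* +1 for the terminator */
    if (s == NULL)
        return NULL;
    s->type = VAL_STR;
    s->length = len;
    memcpy(s->data, src, len + 1);
    return s;
}
```

The caller owns the result and releases it with a single free().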

Upvotes: 0

Views: 289

Answers (2)

Chris Dodd

Reputation: 126448

In the latter case you probably also define

typedef union val {
    val_type           type;
    struct val_int     val_int;
    struct val_str     val_str;
    struct val_vec     val_vec;
} val;

and now you have a union type that can hold any value type. Indeed, this was the common way of doing this before anonymous unions and structs existed. When you have an object you know is only ever going to be a val_int, you can save a couple of bytes of memory by just allocating a val_int and casting the address of it to a val *.
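A minimal sketch of how such a union might be used, assuming hypothetical tag constants and two hypothetical helpers (val_from_int, val_as_int). It works on full-sized union objects; the smaller val_int-only allocation described above is deliberately left out, since whether accessing it through a val * is well defined deserves a separate check.

```c
#include <stddef.h>

typedef enum { VAL_INT, VAL_STR } val_type;   /* hypothetical tags */

struct val_int { val_type type; int i; };
struct val_str { val_type type; char *buffer; size_t length; size_t capacity; };

typedef union val {
    val_type       type;      /* overlays the tag of every variant */
    struct val_int val_int;
    struct val_str val_str;
} val;

/* Build an int value inside a full-sized union object. */
val val_from_int(int i)
{
    val v;
    v.val_int.type = VAL_INT;
    v.val_int.i = i;
    return v;
}

/* Dispatch on the shared tag; returns fallback for non-int values. */
int val_as_int(const val *v, int fallback)
{
    return v->type == VAL_INT ? v->val_int.i : fallback;
}
```

Every variant starts with the same val_type member, so the tag can be read through v->type no matter which variant was stored.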

Upvotes: 1

chux

Reputation: 154169

I'm reasoning that there is a trade off between the cost of an extra indirection for accessing the primitive types

Premature optimization is rarely worth your valuable coding time. Any improvement here is at best linear (a constant factor). 6.001 vs 1/2 dozen of the other.

One way or the other may be better in your particular circumstances, yet the more information you give the compiler, the more likely it is to optimize better than you can here.

Code for clarity, which usually means avoiding casts.

The alternative lacks details and is likely problematic given the usual incorrect assumptions: that void * or val_type * is a universal pointer, that pointers to FP convert nicely to other pointers, that the alignment of a cast pointer is sufficient for other types, and that strict aliasing is not a concern.
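If the cast-based layout is used anyway, some of these assumptions can at least be written down and checked. The sketch below is an illustration under assumptions (the tag constants, val_float, and val_to_double are hypothetical): it statically asserts that the tag is the first member and only converts a header pointer back to a concrete type after inspecting the tag. It relies on the rule that a pointer to a struct and a pointer to its first member convert to each other, so the header pointer must really come from an object of the tagged type.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { VAL_INT, VAL_FLOAT } val_type;   /* hypothetical tags */

typedef struct val_int   { val_type type; int i; }    val_int;
typedef struct val_float { val_type type; double f; } val_float;

/* The tag must be the first member for the header cast to be meaningful. */
static_assert(offsetof(val_int, type) == 0, "tag must come first");
static_assert(offsetof(val_float, type) == 0, "tag must come first");

/* Check the tag before converting back to the concrete type. */
double val_to_double(const val_type *header)
{
    switch (*header) {
    case VAL_INT:   return (double)((const val_int *)header)->i;
    case VAL_FLOAT: return ((const val_float *)header)->f;
    }
    return 0.0;  /* unknown tag */
}
```

Passing &some_val_int.type as the header keeps the conversion tied to a real val_int object, which is the case the cast is meant for.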

Upvotes: 0
