nowi

Reputation: 487

Compile-time Validated Type Erasure

Here's my goal: to erase type information to simplify object access. Here's a simple example of my goal:

class magic;

magic m = std::string("hello"); // ok: m now stores a string
m = 32;                         // error: m is supposed to be a string
m += " world";                  // ok: operator for this exists

You might've noticed that this basically functions like the auto keyword.

To continue, it'd also preferably not vary its size depending on type (e.g. use a pointer). That way I can use a container for it.

std::vector<magic> vec; // homogeneous
vec.emplace_back(8);
vec.emplace_back(std::string("str"));
vec[0] = 4; // ok
vec[1] = 2; // no way, jose. compile error here because vec[1] is a string

The idea is that it has to be compile-time (not runtime like with std::any or std::variant) because the types are known at compile-time anyway; it's just extra overhead I don't need.

The reason I know this is possible is because auto already does the job. I just need a container of some type that functions like auto* that actually validates operations at compile-time to save on overhead and very tedious redundant programming.

Here's how I sort of plan to use it (warning: bad pseudo code)

struct base
{
    auto* p;
};
struct child: base<int> // child implements base as an int
{
    // use p and implement whatever functions are necessary
};
std::vector<base> vec;
vec.emplace_back(child());
vec[0] = 20;

If you're worried about the "key" changing depending on what gets pushed back, pretend it's a map instead of a vector. I have a hunch that the STL containers aren't going to work anyway, so feel free to post an answer with a container that uses compile-time type erasure; I think that might be much easier than an independent type.

Upvotes: 1

Views: 962

Answers (2)

David Ledger

Reputation: 2101

Unfortunately, I cannot answer your question with the same syntax as in the question because, as others have stated, auto works differently from what you assume. auto is just a deduced type.

If the variable is initialized with an int, its type is int. However, this deduction only happens once, at the declaration. Any subsequent assignment is just assigning to an int, not to an auto. The type of an auto variable is not dynamic and neither is its storage, which is why auto cannot be used to store different types in a std::vector.

Just to add to the other answer, hopefully helping understanding:


auto i = 10;

The type of i here is int not auto.


auto b = true;

The type of b here is bool not auto.


However, I can do my best to solve what I believe is the problem you're facing.


What this answer does:

  1. At compile time ensure that access to a variable is done through a function with correct parameter type (bypassing the need for checking type).

  2. Provide access to type-erased data without exceptions (I think it's safe...).

  3. Allow modification of the data.


What this doesn't do:

  1. Run at compile time, because of the reinterpret_cast.
  2. Allow assignment directly through members in std::vector<>, although you can assign from within the called access function.

How it works:

A callback function with a typed parameter of T& is type erased and stored as a generic function pointer. The storage type for this pointer is void (*)() rather than void *, because function pointers are not the same as ordinary data pointers; they often have a different size.

The accessor function with the typed parameter is set up to be called by a function that takes two type-erased pointer parameters. The parameters are converted back to their real types inside this function; the types are known because they were present in the constructor of the base object. A pointer to this function, which is created within the constructor as a lambda, is stored in the runner function pointer.

When access() is run, it calls the runner function with the data pointer and the accessor function as arguments. The runner then calls the accessor function with the data, this time after casting it to the correct type.

So whenever access is required, a type-erased version of the accessor is called, which internally calls the typed function. I can add support for lambdas in a later version of this, but it's pretty complicated already and I thought I would just post it now...

Inside the base class there is a destructor object. This is a general way to store a type-erased destructor and is almost the same as Herb Sutter's method. It just makes sure that the data given to the base has its destructor run.


A heap-based approach is conceptually simpler; you can run it here: https://godbolt.org/z/cb-a6m

A stack-based approach may be faster but has more limitations: https://godbolt.org/z/vxS4tJ


The heap-based (simpler) code:

#include <iostream>
#include <memory>
#include <string>
#include <utility>
#include <vector>


template <typename T>
struct mirror { using type = T; };
template <typename T>
using mirror_t = typename mirror<T>::type;

struct destructor
{
    const void* p = nullptr;
    void(*destroy)(const void*) = nullptr;
    //
    template <typename T>
    destructor(T& data) noexcept :
        p{ std::addressof(data) },
        // delete runs ~T() and also frees the storage that was obtained with new.
        destroy{ [](const void* v) { delete static_cast<T const*>(v); } }
    {}
    destructor(destructor&& d) noexcept
    {
        p = d.p;
        destroy = d.destroy;
        d.p = nullptr;
        d.destroy = nullptr;
    }
    destructor& operator=(destructor&& d) noexcept
    {
        if (this != &d)
        {
            if (p and destroy) destroy(p); // release what we currently own first
            p = d.p;
            destroy = d.destroy;
            d.p = nullptr;
            d.destroy = nullptr;
        }
        return *this;
    }
    //
    destructor() = default;
    ~destructor()
    {
        if (p and destroy) destroy(p);
    }
};

struct base
{
    using void_ptr_t = void*;          // Correct size for a data pointer.
    using void_func_ptr_t = void(*)(); // Correct size for a function pointer.
    using callback_t = void (*)(void_func_ptr_t, void_ptr_t);
    //
    void_ptr_t data;
    void_func_ptr_t function;
    callback_t runner;
    destructor destruct;
    //
    template <typename T>
    constexpr base(T * value, void (*callback)(mirror_t<T>&)) noexcept :
        data{ static_cast<void_ptr_t>(value) },
        function{ reinterpret_cast<void_func_ptr_t>(callback) },
        runner{
            [](void_func_ptr_t f, void_ptr_t p) noexcept
            {
                using param = T&;
                using f_ptr = void (*)(param);
                reinterpret_cast<f_ptr>(f)(*static_cast<T*>(p));
            }
        },
        destruct{ *value }
    {}
    //
    constexpr void access() const noexcept
    {
        if (function and data) runner(function, data);
    }
};

struct custom_type
{
    custom_type()
    {
        std::cout << __func__ << "\n";
    }
    custom_type(custom_type const&)
    {
        std::cout << __func__ << "\n";
    }
    custom_type(custom_type &&)
    {
        std::cout << __func__ << "\n";
    }
    ~custom_type()
    {
        std::cout << __func__ << "\n";
    }
};
//
void int_access(int & a)
{
    std::cout << "int_access a = " << a << "\n";
    a = 11;
}
void string_access(std::string & a)
{
    std::cout << "string_access a = " << a << "\n";
    a = "I'm no longer a large string";
}
void custom_access(custom_type& a)
{
    (void)a; // Nothing to do: custom_type only logs its special member functions.
}

int main()
{
    std::vector<base> items;
    items.emplace_back(new std::string{ "hello this is a long string which doesn't just sit in small string optimisations, this needs to be tested in a tight loop to confirm no memory leaks are occuring." }, &string_access);
    items.emplace_back(new custom_type{},   &custom_access);
    items.emplace_back(new int (10),        &int_access);
    //
    for (auto& item : items)
    {
        item.access();
    }
    for (auto& item : items)
    {
        item.access();
    }
    //
    return 0;
}

Upvotes: 1

Nicol Bolas

Reputation: 473252

Type erasure is a runtime concept. By definition, it cannot be validated at compile-time. If any such magic type could exist, there is no way that it could determine at compile-time that vec[0] = 4 is OK, while vec[1] = 2 isn't.

"The reason I know this is possible is because auto already does the job."

No, it does not. auto is a grammatical construct that causes C++ to deduce the (compile-time determined) type of the variable based on the (compile-time determined) type of an expression. auto exists within the compiler, not the runtime.

What you want is something that happens at runtime. While the type of any particular vec[X] is determined at compile-time, the value of it is a runtime property. You want the value to somehow make an assignment a compile error or not. That is not possible.

This is why tuple uses get<X> rather than get(X). The index must be a compile-time constant, which allows the type of get<X> to potentially be different for each particular X in a tuple.

The properties of a type, like being assignable from an integer, are compile-time constructs. That is, either vec[X] = 4 is well-formed code or it isn't; it is impossible to make it sometimes be well-formed and sometimes not be, depending on X and the contents of the vec. You can make it UB, or throw an exception. But you can't make it a compile error.

Upvotes: 2
