gonzo

Reputation: 519

C++ make placement new aligned storage initializable using constexpr

This question is really important to me because it is a bottleneck right now, and I'm investigating possible ways to solve it: I need to constexpr-construct a fairly simple std::function-like class that I am using. However, it uses aligned storage so that the number of pointer-sized captured elements can be configured. Let's call it Function.

https://github.com/fwsGonzo/libriscv/blob/master/lib/libriscv/util/function.hpp#L91

Specifically, I am using Function with at most one captured pointer, usually "this". These functions work wonderfully, and they will not compile if you try to capture too much.

The problem is that they have to be constructed at run-time, and there are so many of them that construction takes around 3500 nanoseconds (3.5 microseconds), which is an eternity for my use case. I absolutely have to reduce this setup cost somehow, so the natural approach is to investigate whether I can construct them at compile-time.

I've been unable to do so and the compiler outright tells me that the constructor which uses placement new cannot be used in a constexpr context. This question tells the same story:

C++ constexpr in place aligned storage construction

You can see the problematic statement here: https://github.com/fwsGonzo/libriscv/blob/master/lib/libriscv/util/function.hpp#L148

template<typename Callable>
Function (Callable callable) noexcept
{
    static_assert(sizeof(Callable) <= FunctionStorageSize,
                  "Callable too large (greater than FunctionStorageSize)");
    static_assert(std::is_trivially_copy_constructible_v<Callable>,
                  "Callable not trivially copy constructible");
    static_assert(std::is_trivially_destructible_v<Callable>,
                  "Callable not trivially destructible");

    m_func_ptr = &trampoline<Callable>;

    new(reinterpret_cast<Callable *>(m_storage.data)) Callable(callable);
}

I am using C++20 and I am open to suggestions on how to solve this. Given that these functions have a constant-sized capture storage with a single function pointer, is it possible to construct these at compile time somehow? No heap allocations should result from this.

Upvotes: 1

Views: 675

Answers (4)

Bernd

Reputation: 2221

I created my own type-erasing function, too. It is not constexpr because I need to use placement new or std::memcpy to fill my storage.

The main idea is to use a non-capturing lambda to generate the trampoline; perhaps you can use it. The optimized, generated assembly looks really good to me... godbolt

#include <cstddef>
#include <iostream>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>

namespace Test
{
    template<typename Return, typename... Args>
    using InvokeFktPtr = Return(*)(const void*, Args...);

    template <
        typename Fkt
    >
    class SingleCastDelegate;

    template <
        typename ReturnType,
        typename... Args
    >
    class SingleCastDelegate<ReturnType(Args...)>
    {
    private:
        InvokeFktPtr<ReturnType, Args...> invokeFktPtr;
    private:
        static constexpr std::size_t max_lambda_size = 4 * sizeof(void*);
        // pointer-aligned so that lambdas capturing pointers can be placed here
        alignas(void*) std::byte storage[max_lambda_size];
    private:
        constexpr const void* GetData() const
        {
            return std::addressof(storage[0]);
        }
        constexpr void* GetData()
        {
            return std::addressof(storage[0]);
        }
    public:
        template<
            typename Lambda
            ,typename PureLambda = std::remove_reference_t<Lambda>
        >
        inline SingleCastDelegate(Lambda&& lambda)
        {
            constexpr auto lambdaSize = sizeof(PureLambda);
            static_assert(lambdaSize <= sizeof(void*) * 4);
            
            //add some static_asserts... (it must be trivial...)
            
            //placement new is not constexpr, or?
            new(std::addressof(storage)) PureLambda(lambda);

            invokeFktPtr = [](const void* data, Args... args)
            {
                const PureLambda& l = *static_cast<const PureLambda*>(data);
                return l(args...);
            };
        }

        template<
            typename... CustomArgs
        >
        using FktPtr = ReturnType(*)(CustomArgs...);

        template<
            typename... CustomArgs
            , typename = typename std::enable_if_t<std::is_invocable_v<FktPtr<Args...>, CustomArgs...>>
        >
        constexpr ReturnType operator()(CustomArgs&&... args) const
        {
            return invokeFktPtr(GetData(), std::forward<CustomArgs>(args)...);
        }
    };
}


int main()
{

    int i = 42;

    auto myFkt = [=](){
        std::cout << i;
    };
    auto myOtherFkt = [=](){
        std::cout << i * 2;
    };
    Test::SingleCastDelegate<void()> fkt = Test::SingleCastDelegate<void()>{ myFkt };
    fkt();

    fkt = myOtherFkt;
    fkt();

    return 0;
}

Upvotes: 1

gonzo

Reputation: 519

I ended up finding a solution to this problem. Most of my function-like objects are just raw function pointers wrapped in this Function class, so I tried to make that portion constexpr, with success. This is not something others could have answered, because you can't think of everything when you write a question, and I simply ended up having more information. Still, to anyone who tries to do this in the future: you will probably not be able to make a lambda with captures constexpr, but you can still do what I did, as described here.

I added a new type that matches raw function pointers, and then handled it in a specialized constructor like this:

template <>
constexpr Function<RawFunctionPointerType>(RawFunctionPointerType fptr) noexcept
    : m_func_ptr(&trampoline<RawFunctionPointerType>), m_real_ptr{fptr}  {}

The m_real_ptr member is in a union with the Storage:

union {
    RawFunctionPointerType m_real_ptr;
    Storage m_storage;
};

This made it possible to declare a constinit std::array of these functions, which could then be std::copy'd into my structure at runtime. Doing it this way saved at least 1 microsecond.

Upvotes: 0

Nicol Bolas

Reputation: 474116

While C++20 does allow you to dynamically allocate memory in constexpr contexts, memory allocated at compile-time is not allowed to leak into runtime execution. So constexpr allocations must be statically bound to constant expression evaluation.

And even with C++20's features, you can't use placement new at compile time.

Upvotes: 1

Oliv

Reputation: 18081

Using C++20, and if you tightened the constraint on the type Callable to require trivially_copyable, you could use bit_cast. You would also have to define a union containing a member of type aligned_storage<size, alignment> for every possible object size.

Unfortunately, I don't think there is a constexpr implementation of bit_cast yet.

A partial solution could be to declare a constexpr constructor if Callable designates a pointer to object type:

template<typename Callable>
constexpr
Function (Callable * callable) noexcept
   : m_func_ptr {&trampoline<Callable>}
   , m_pointer {callable}
   {}

// declare the union
union {
    void * m_pointer;
    Storage m_storage;
};

// and add an overload of trampoline specialized for pointer to object.

Upvotes: 1
