lobelk

Reputation: 502

How to join compile-time string-like objects while keeping the API simple?

I am trying to concatenate string-like objects at compile-time. With the help of this post, I came up with something like this:

#include <cstddef>
#include <utility>
#include <algorithm>
#include <array>
#include <string_view>

template <std::size_t N>
class CharArray
{
  private:
    std::array<char, N> _string;

    template <std::size_t S>
    friend class CharArray;

    template <std::size_t S1, std::size_t S2>
    constexpr CharArray(const std::array<char, S1> s1, const std::array<char, S2> s2)
        : _string() {
        std::copy(s1.begin(), s1.end() - 1, _string.begin());
        std::copy(s2.begin(), s2.end() - 1, _string.begin() + S1 - 1);
    }

  public:
    constexpr CharArray(const char (&str)[N]) : _string() {
        std::copy(&str[0], &str[0] + N, _string.begin());
    }

    constexpr CharArray(const std::array<char, N>& str) : _string() {
        std::copy(std::cbegin(str), std::cend(str), _string.begin());
    }

    template <std::size_t S>
    constexpr auto operator+(const CharArray<S> other) const {
        return CharArray<N + other._string.size() - 1>(_string, other._string);
    }

    [[nodiscard]]
    constexpr auto c_str() const {
        return _string.data();
    }
};

template <std::size_t N, typename... Strings>
constexpr auto join_chars(const char (&first)[N], Strings&&... rest) {
    if constexpr (!sizeof...(Strings)) { return CharArray<N>{first}; }
    else {
        return CharArray<N>{first} + join_chars(std::forward<Strings>(rest)...);
    }
}

#include <iostream>

int main() {
    // this works;
    constexpr const char name[] = "Edward";
    constexpr auto joined1 = join_chars("name=", name);
    std::cout << joined1.c_str() << std::endl;

    // this does not work:
    // constexpr std::string_view value = "42"; // essentially same as constexpr const char*
    // constexpr auto joined2 = join_chars("value=", value);
    // std::cout << joined2.c_str() << std::endl;

    return 0;
}

However, this works only for string literals and char arrays. Is there a way to extend the functionality for other compile-time string-like objects?

EDIT: As suggested by @Oersted, one way to achieve this is by adding these two static functions to the CharArray class:

template <std::size_t N>
class CharArray
{
  //...
  public:
    static constexpr CharArray create(const char* c_ptr) {
        std::array<char, N> tmp_array;
        std::copy(c_ptr, c_ptr + N, tmp_array.begin());
        CharArray<N> char_array{tmp_array};
        return char_array;
    }

    template <typename T>
    requires std::is_constructible_v<std::string_view, T>
    static constexpr CharArray create(const T& str) {
        return CharArray::create(str.data());
    }
};

And then adding a macro and an overload:

template <typename T>
requires std::is_constructible_v<std::string_view, T>
constexpr std::size_t constexpr_strlen(const T& c_ptr) {
    return std::string_view{c_ptr}.length();
}

#define ConstexprChars(str) CharArray<constexpr_strlen(str) + 1>::create(str)

template <std::size_t N, typename... Strings>
constexpr auto join_chars(const CharArray<N>& first, Strings&&... rest) {
    if constexpr (!sizeof...(Strings)) { return first; }
    else { return first + join_chars(std::forward<Strings>(rest)...); }
}

Then, one could do:

int main() {
    constexpr const char* value = "42";
    constexpr auto joined2 = join_chars("value=", ConstexprChars(value));
    std::cout << joined2.c_str() << std::endl;

    return 0;
}

However, this changes the API. Is it possible to achieve this while retaining the same API? That is, is it possible to have this:

int main() {
    constexpr const char* value = "42";
    constexpr auto joined2 = join_chars("value=", value);
    std::cout << joined2.c_str() << std::endl;

    return 0;
}

EDIT 2: I have found a video by Jason Turner dealing with a similar problem (he also added static storage). He dealt with it by using a lambda as a constexpr function parameter (which is apparently allowed). So instead of having a class like my CharArray, he used that lambda trick. This does not make the API any better, since you have to wrap your char* in a lambda, so I guess that is the best one can get for now.
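For readers who have not seen the lambda trick, here is a minimal sketch of the idea as I understand it. It is not the code from the video: the helper name to_char_array is made up for illustration, and it assumes the lambda has no captures and returns a std::string_view that refers to static storage.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not from the question or the video): copies the
// string returned by a captureless lambda into a right-sized array.
template <typename F>
constexpr auto to_char_array(F make_string) {
    // The lambda has no captures, so calling it reads no runtime state and
    // its result can initialize a constexpr variable inside the function.
    constexpr std::string_view sv = make_string();
    std::array<char, sv.size() + 1> out{};  // +1 keeps a null terminator
    for (std::size_t i = 0; i < sv.size(); ++i) out[i] = sv[i];
    return out;
}

constexpr const char* value = "42";  // the length is not part of this type

// The lambda "carries" the pointer into the function as part of its type.
inline constexpr auto value_chars =
    to_char_array([] { return std::string_view{value}; });

static_assert(value_chars.size() == 3);
```

The call site still has to spell out the lambda, which is exactly the API cost mentioned above; a macro could hide it, but a plain function call cannot.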

Upvotes: 2

Views: 343

Answers (3)

Oersted

Reputation: 2776

Wrapping up my comments, I arrived at this snippet. Does it fulfill your needs?

#include <algorithm>
#include <array>
#include <cstddef>
#include <iostream>
#include <string_view>

namespace Details {
template <size_t N>
struct StringWrapper {
    std::array<char, N> value;
    std::size_t size;

    constexpr explicit(false) StringWrapper(const char (&str)[N]) : size(N) {
        std::copy_n(static_cast<const char *>(str), N, value.data());
    }
};
}  // namespace Details

template <Details::StringWrapper head, Details::StringWrapper tail>
class SConcat {
    constexpr static std::array<char, head.size + tail.size - 1> init() {
        std::array<char, head.size + tail.size - 1> conc;
        std::copy(std::cbegin(head.value), std::cend(head.value) - 1,
                  std::begin(conc));
        std::copy(std::cbegin(tail.value), std::cend(tail.value),
                  std::begin(conc) + head.size - 1);
        return conc;
    }
    static constexpr std::array<char, head.size + tail.size - 1> data = init();

   public:
    static constexpr std::string_view as_sv() { return {data.data()}; }
};

int main() {
    constexpr char value[] = "42";
    const std::string_view conc{SConcat<"value=", value>::as_sv()};
    std::cout << conc << '\n';
    return static_cast<int>(conc[1] == 'a');
}

LIVE

It can be significantly polished I think but it gives the main ideas.

First, StringWrapper is just a way to implicitly convert a raw string into an object that can be used as a class-type non-type template parameter (CNTTP, so C++20 is required) for SConcat.

Thus SConcat gives a unique type for each pair of strings used in its declaration, and the concatenation result is held in a static member variable, data, which provides the storage.

You can then get a string_view through the as_sv() method.

(NB1 by removing the std::cout part you can see how the compiler optimizes everything, the assembly is then:

main:
        mov     eax, 1
        ret

NB2 static_cast are just there to silence some false-positive warnings. )


EDIT: an alternative that supports various kinds of string declarations, as requested by the OP; it is a bit more verbose.

#include <algorithm>
#include <array>
#include <cstddef>
#include <string>
#include <string_view>

// #define PROVEOPTIM
#ifndef PROVEOPTIM
#include <iostream>
#endif

namespace Details {
constexpr std::size_t MySize(const char *str) {
    std::string tmp{str};
    return tmp.length();
}
template <std::size_t N>
constexpr auto MakeArray(const char *str) {
    std::array<char, N + 1> arr;
    std::copy_n(str, N + 1, arr.data());
    return arr;
}
}  // namespace Details

#define MAKEARRAY(str) Details::MakeArray<Details::MySize(str)>(str)

template <std::array head, std::array tail>
class SConcat {
    constexpr static std::array<char, head.size() + tail.size() - 1> init() {
        std::array<char, head.size() + tail.size() - 1> conc;
        std::copy(std::cbegin(head), std::cend(head) - 1, std::begin(conc));
        std::copy(std::cbegin(tail), std::cend(tail),
                  std::begin(conc) + head.size() - 1);
        return conc;
    }
    static constexpr std::array<char, head.size() + tail.size() - 1> data =
        init();

   public:
    static constexpr std::string_view as_sv() { return {data.data()}; }
};

int main() {
    constexpr const char *value = "42";
    constexpr const char value2[] = "3.14";
    const std::string_view conc{
        SConcat<MAKEARRAY("value="), MAKEARRAY(value)>::as_sv()};
    const std::string_view conc2{
        SConcat<MAKEARRAY("value="), MAKEARRAY(value2)>::as_sv()};
#ifndef PROVEOPTIM
    std::cout << conc << '\n';
    std::cout << conc2 << '\n';
#endif
    return static_cast<int>(conc[1] == 'a');
}

LIVE

It relies on building explicitly a std::array from a pointer to a null terminated string passed as const char* or const char[]. The macro MAKEARRAY merges two operations into one: getting the size as a compile-time constant and copying data from the same source used to get the size. Obviously, if the "strings" are not null-terminated the result is undefined.


EDIT: with another layer of macros we can get even closer to the required API:

#define cconcat(lhs,rhs) SConcat<MAKEARRAY((lhs)),MAKEARRAY((rhs))>::as_sv()
...
    // another macro trick in order to match OP API
    constexpr auto joined1 = cconcat("value=", value);
    static constexpr std::string str{"constexpr "};
    constexpr auto joined2 = cconcat(str.c_str(),"string");
#ifndef PROVEOPTIM
    std::cout << conc << '\n';
    std::cout << conc2 << '\n';
    std::cout << joined1 << '\n';
    std::cout << joined2 << '\n';
#endif
...

LIVE again

I also added another example using a static constexpr std::string.

Yet I'm not sure it fully meets the OP's needs (see this answer about limitations with using string literals).


EDIT again: playing with macros once more, I can extend this to a variable (but bounded) number of arguments.

// adding a method to SConcat
    static constexpr const char *as_cc() { return data.data(); }
...
// declaring a folding macro:
// Recursive Left Fold
#define FOR_EACH_L(fn, ...) __VA_OPT__(FOR_EACH_APPLY0(FOR_EACH_RESULT, FOR_EACH_L_4(fn,"",__VA_ARGS__)))

#define FOR_EACH_L_4(fn, res, ...) FOR_EACH_APPLY4(FOR_EACH_L_3, FOR_EACH_APPLY4(FOR_EACH_L_3, FOR_EACH_APPLY4(FOR_EACH_L_3, fn, res __VA_OPT__(, __VA_ARGS__))))
#define FOR_EACH_L_3(fn, res, ...) FOR_EACH_APPLY3(FOR_EACH_L_2, FOR_EACH_APPLY3(FOR_EACH_L_2, FOR_EACH_APPLY3(FOR_EACH_L_2, fn, res __VA_OPT__(, __VA_ARGS__))))
#define FOR_EACH_L_2(fn, res, ...) FOR_EACH_APPLY2(FOR_EACH_L_1, FOR_EACH_APPLY2(FOR_EACH_L_1, FOR_EACH_APPLY2(FOR_EACH_L_1, fn, res __VA_OPT__(, __VA_ARGS__))))
#define FOR_EACH_L_1(fn, res, ...) FOR_EACH_APPLY1(FOR_EACH_L_0, FOR_EACH_APPLY1(FOR_EACH_L_0, FOR_EACH_APPLY1(FOR_EACH_L_0, fn, res __VA_OPT__(, __VA_ARGS__))))
#define FOR_EACH_L_0(fn, res, ...) fn, FOR_EACH_FIRST(__VA_OPT__(fn(res, FOR_EACH_FIRST(__VA_ARGS__)), ) res) __VA_OPT__(FOR_EACH_TAIL(__VA_ARGS__))

#define FOR_EACH_APPLY4(fn, ...) fn(__VA_ARGS__)
#define FOR_EACH_APPLY3(fn, ...) fn(__VA_ARGS__)
#define FOR_EACH_APPLY2(fn, ...) fn(__VA_ARGS__)
#define FOR_EACH_APPLY1(fn, ...) fn(__VA_ARGS__)
#define FOR_EACH_APPLY0(fn, ...) fn(__VA_ARGS__)
#define FOR_EACH_FIRST(el, ...) el
#define FOR_EACH_TAIL(el, ...) __VA_OPT__(, __VA_ARGS__)
#define FOR_EACH_RESULT(fn, res, ...) res

#define MY_FUNC(nested, var) cconcatcc(nested,var)
#define vconcat(...) std::string_view{FOR_EACH_L(MY_FUNC, __VA_ARGS__)}
...
// usage
int main() {
    constexpr const char *value = "42";
    constexpr const char value2[] = "3.14";
    const std::string_view conc{
        SConcat<MAKEARRAY("value="), MAKEARRAY(value)>::as_sv()};
    const std::string_view conc2{
        SConcat<MAKEARRAY("value="), MAKEARRAY(value2)>::as_sv()};
    // another macro trick in order to match OP API
    constexpr auto joined1 = cconcat("value=", value);
    static constexpr std::string str{"constexpr "};
    constexpr auto joined2 = vconcat(str.c_str(), "string");
    constexpr const char *value3 = " variadic";
    constexpr auto joined3 = vconcat(str.c_str(), "string",value3);
#ifndef PROVEOPTIM
    std::cout << conc << '\n';
    std::cout << conc2 << '\n';
    std::cout << joined1 << '\n';
    std::cout << joined2 << '\n';
    std::cout << joined3 << '\n';
#endif
    return static_cast<int>(conc[1] == 'a');
}

LIVE

This folding macro is strongly inspired by this post. I made a few adjustments to make it work with my own macro, and it seems to give the expected result.

I feel that the macro folding can be improved; I'll let others propose improvements.

If you need more arguments, you'll need tooling (some external script) to generate the folding macro with enough layers. I think it's quite a common practice with this kind of macro.

I don't think that C++, so far, offers fully recursive/variadic macros (in the sense of a variadic function, for instance). If I'm wrong, I'll be pleased to learn how it can be done.

Upvotes: 1

Enlico

Reputation: 28510

As written by @NathanOliver in a comment under your question, what you ask is just not possible.¹

The example of desired code,

int main() {
    constexpr const char* value = "42";
    constexpr auto joined2 = cconcat("value=", value);
    std::cout << joined2.c_str() << std::endl;

    return 0;
}

is highly misleading, because it circumvents the whole problem by putting the callee, hence the arguments passed to the caller, and the caller in the same translation unit, thus making the compiler aware of things that it would otherwise not know.

Indeed, the only reason why the instantiation of cconcat succeeds, is that value is constexpr (and "value=" is a string literal), so Ns... can be all deduced by the compiler.

A simpler example is this

#include <cstddef>
template <std::size_t N>
constexpr auto foo(char const (&s)[N]) {
    return N;
}
int fun() {
    constexpr const char name[] = "Edward";
    static_assert(foo(name) == 7);
    return foo(name);
}

which compiles down to

fun():
        mov     eax, 7
        ret

But as soon as you make the input string come from another translation unit, like in this case,

#include <cstddef>
template <std::size_t N>
constexpr auto foo(char const (&s)[N]) {
    return N;
}
char const* getstring(); // defined in other TU
int fun() {
    foo(getstring()); // the foo above is a non-viable candidate
    return 0;
}

then you can't even compile, irrespective of whether the callee returns a compile time string (or even a string literal) via that char const*.

Clarification 1

I'm not saying that the direct cause of the code above not compiling is that getstring is defined in another TU.

The direct cause, as you point out in a comment, is clearly that getstring returns char const*, a by-pointer C-style string of unknown length, whereas foo accepts char const(&)[N], a reference to a C-style string whose length must be known at compile time (indeed N is determined by template argument deduction).

But getstring is returning char const* precisely because it's defined in a TU that is meant to be linked against. The only way for another TU to return a C-style string, in a way that you can link it to a TU that calls foo(getstring());, is to have getstring return a C-style string by reference, which implies that it returns a string of known size! This, for instance, compiles

#include <cstddef>
template <std::size_t N>
constexpr auto foo(const char (&)[N]) {
    return N;
}

char const (&getstring())[5]; // defined in another TU

int fun() {
    foo(getstring());
    return 0;
}

but it is of little to no interest, imho, because it means that getstring can return only strings of a known length, 5 in the example.

Clarification 2

Since you seem to be positive about what I called the non-interesting case of caller and callee in the same TU, let me clarify what I meant by quoting myself:

it circumvents the whole problem by putting the callee, hence the arguments passed to the caller, and the caller in the same translation unit

Here I simplified a bit, by ascribing to "caller" both the call site and the "producer" of the strings that the caller passes to the callee. After all, you wrote these two lines together:

    constexpr const char* value = "42";
    constexpr auto joined2 = cconcat("value=", value);

The only way for cconcat to concatenate strings at compile time is for those strings to be known at compile time! How would you possibly concatenate at compile time strings that will only be known at run time?

This, one way or another, means that you have all the strings in the same TU where cconcat is called and defined. They surely don't come from a call to a char const*-returning function defined who knows where.

Furthermore, you can't let the constexpr-ness be lost across function boundaries. And remember, what David Stone presented is not a thing at the moment, so it doesn't matter how much you and I know that char const* s = "hello"; is written in the same place where cconcat is defined and called like cconcat("a literal", s): the length of s is not known at compile time inside cconcat. End of story.

So you're back to your original solution, and that's it!

Clarification 3

As much as constexpr char [const]* is constexpr, the length of that string is not part of its type. E.g.

constexpr char const* s1{"hello"};
constexpr char const* s2{"hello world"};
static_assert(std::is_same_v<decltype(s1), decltype(s2)>);

compiles.

If a property of a value is not encoded in its type, the constexpr-ness of that value will be lost when that value is passed to a function as an argument, so, even in one TU,

consteval auto f(char const* s) {
  // the size of s is not known here!
}
int main() {
  constexpr char const* s{"hello"};
  f(s);
}

Furthermore, given s1 and s2 have the same type as shown above, you should understand you're out of luck even if you try to pass the C-style string as a NTTP; as in, if you define

template<char const* s>
consteval auto foo() {
    int c{};
    while (s[c] != '\0') { c++; }
    return c;
};

the following will not compile!

int main() {
  constexpr char const* s{"hello"};
  foo<s>();
}

Notice that this compiles³:

constexpr char const s[]{"hello"};
int main() {
  foo<s>();
}

but the type of s is char const[6], so the size is part of the type! Indeed

constexpr char const s1[]{"hello"};
constexpr char const s2[]{"hello world"};
static_assert(std::is_same_v<decltype(s1), decltype(s2)>);

does not compile!


(¹) If you're truly asking for a solution to a problem where caller and callee are in the same TU, then the discussion is moot, imho, and the solution lies, in the worst case scenario, in a bit of template metaprogramming you can find for instance here. And whatever trick you'd need, you'd need it because C++ still has a long way to go; watch this talk by David Stone.

But as far as solving the problem in the real-world case of caller and callee in different TUs, there's no solution to your use case, at least not one made possible by the compiler², because by the time caller and callee actually come in contact, the compiler (actually compilers, because two different ones could have been used for caller and callee) has long since finished its job.

(²) Assuming only 2 TUs, you can imagine a very smart linker that would inspect the caller code's object file to work out what the lengths of all the strings passed to cconcat are, and then... basically modify the object code of the callee... which would essentially mean recompile it, I believe...

(³) Incidentally, I'm not sure why, for the above to compile, s is required to be static (so you need to write static if you move its definition inside function scope, i.e. main in this case).

Upvotes: 2

Red.Wave

Reputation: 4257

You need to store the concatenation result in a compile-time sized array (e.g. std::array). But std::string_view::size is part of its value (as opposed to its type), and cannot be used to determine the result type of a concatenation function. The most straightforward approach would be to define a fixed_string class template and put the size in its type. The class template can declare conversion operators to std::array, std::string_view and std::string. For runtime concatenation you need to provide at least one instance of std::string in the concatenation sequence. If operator+ is used to perform the concatenation, the syntax remains consistent:

template<std::array val>
struct fixed_string{
    using this_arr_t = std::remove_cv_t<decltype(val)>;
    using value_type = this_arr_t::value_type;
    using string_view = std::basic_string_view<value_type>;

    consteval static auto size()
    {return val.size() - !val.back();}; // drop the trailing null, if any

    consteval operator std::basic_string_view<value_type>()
    const{return string_view{data(val), this->size()};};

    constexpr operator std::basic_string<value_type>()
    const{return std::basic_string{
                 string_view{*this}};};

    consteval static auto begin()
    {return std::begin(string_view{fixed_string{}});};

    consteval static auto end()
    {return std::end(string_view{fixed_string{}});};

    template<std::size_t N>
    consteval static auto arrcat(std::array<value_type, N> rhs)
    ->std::array<value_type, size() + N> {
           std::array<value_type, size() + N> res;
           // ranges::copy returns {in, out}; continue writing at .out
           std::ranges::copy(rhs,
           std::ranges::copy(fixed_string{}, res.begin()).out);
           return res;
    };

    template<std::array rhs>
    consteval fixed_string<arrcat(rhs)>
    operator+(fixed_string<rhs>)
    {return {};};
private:
    // I would like the class empty 
};

template<std::array val>
consteval fixed_string<val> operator""_fxstr()
{return {};};

auto str = "this"_fxstr + " is a "_fxstr + "std::string"s;

I have left out a lot of implementation details for clarity. For example, I would remove the null terminator from the end, but one might want to keep it. But <ranges> provides a much easier way to achieve this with zero effort:

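// C++23: views::join_with takes the delimiter (views::join takes none),
// and the array elements must all have the same type.
auto str = std::array{ "this"sv
                     , "is"sv
                     , "a"sv
                     , "std::string"sv }
         | std::views::join_with('_')
         | std::ranges::to<std::string>();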

The result of the join can be constexpr if the inputs are. But it is not a contiguous range until it has been collected into a string or vector with ranges::to.

One final point - as already mentioned in comments - is to avoid defining identifiers with same names as those in well-known libraries like Qt: QObject is an irrelevant name to your intended use, and an abstract base class name in Qt.

Upvotes: 1
