PasterOfMuppets

Reputation: 197

std::string_view complexity for constant strings

I am developing libraries that contain a large number of static strings. To optimize the runtime a bit, I want to change them from null-terminated strings (classic C-style char arrays) to structures with a known length, such as std::string, std::string_view, or custom pointer+length pairs. std::string is well known and versatile, but it has a heavy footprint (typically 24 to 32 bytes on x64, depending on the standard library) plus allocation overhead (heap and runtime cost) whenever the small-string optimization does not apply, which is the case for most of my data. string_view sounds like the best candidate, as long as I can be sure that the data is null-terminated (string_view itself gives no such guarantee, but the risk can be mitigated by convention).

The question that still matters a lot: do modern compilers pre-initialize a string_view without an internal strlen call, i.e. with constant complexity rather than O(n)? If they do, is the null terminator preserved?

I could, of course, add some macro to initialize the string_views but that would be cumbersome. Something like:

#define foo(buf) buf, (sizeof(buf)-1)
const string_view svMyVar(foo("original C-string value"));

I am hoping the compiler can do this for me.

Upvotes: 4

Views: 3080

Answers (2)

eerorika

Reputation: 238421

Although the std::string_view(const char*) constructor has linear complexity, it is a constexpr constructor and a string literal is a compile-time constant. In practice, optimisers are often able to do the linear-time length computation at compile time, which leaves only constant work at runtime. In constant evaluation contexts, this is guaranteed.
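As a minimal sketch of the guaranteed case (assuming C++17 or later; the variable name kGreeting is made up for illustration), declaring the view constexpr puts the construction in a constant evaluation context, so the length has to be computed by the compiler:

```cpp
#include <string_view>

// constexpr requires the initializer to be a constant expression, so the
// length is computed during compilation and no strlen-like loop is left
// to run at program start-up.
constexpr std::string_view kGreeting("original C-string value");

// Compiles only if the size is available as a compile-time constant.
static_assert(kGreeting.size() == 23, "length is known at compile time");
```

If the static_assert compiles, the size was available without any runtime work.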

Note that your suggested macro, as well as the template suggested in the other answer, behaves differently from std::string_view(const char*): string literals may contain embedded null characters, and the constructor stops at the first one while the macro covers the entire literal.

This is particularly problematic when the macro is used on an array that is not a string literal and that contains uninitialised elements with garbage values:

const string_view svMyVar1("test\0test");
// svMyVar1.size() == 4

const string_view svMyVar2(foo("test\0test"));
// svMyVar2.size() == 9

char arr[32];
arr[0] = 'a';
arr[1] = '\0';
// arr now contains a null terminated "a" followed by 30 garbage chars
const string_view svMyVar3(foo(arr));
// svMyVar3.size() == 31, contains garbage

"If they do, is the null-terminator preserved?"

A string view doesn't modify the array that it refers to. If the referred array contains a null terminator after the referred string, then the null terminator remains there. If there isn't a null terminator, then none is added.

Reading svMyVar[svMyVar.size()] still has undefined behaviour, even if there is a null terminator just outside the bounds of the view.

On the other hand, reading *(svMyVar.data() + svMyVar.size()) is fine if you know that there is a null terminator (or any other character) there. You cannot rely on that being the case for string views in general, but you can rely on it if the view was created from a string literal, which is guaranteed to be null-terminated.
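A short sketch of that convention, assuming C++17 (the names kName and kPrefix are hypothetical): the terminator is readable through data() when the view covers a whole literal, but the same read on a sub-view lands on an ordinary character.

```cpp
#include <string_view>

// View over a whole string literal; the literal's storage is "test\0".
constexpr std::string_view kName("test");

// Reading through data() one element past the view is fine here, because
// the underlying array is known to hold the terminator at that position.
static_assert(*(kName.data() + kName.size()) == '\0');

// A sub-view shares the same storage, so the character just past its end
// is 's', not a terminator. The convention no longer holds for it.
constexpr std::string_view kPrefix = kName.substr(0, 2);
static_assert(*(kPrefix.data() + kPrefix.size()) == 's');
```

This is why the "always null-terminated" convention has to exclude views produced by substr, remove_suffix and similar operations.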


"without using internal strlen"

Mainstream optimising compilers are typically able to evaluate even strlen("literal") at compile time.

Technically, std::string_view uses Traits::length internally, not std::strlen.
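For illustration, and assuming C++17 (where std::char_traits<char>::length became constexpr), both the traits function and the constructor built on it can be checked at compile time. The helper runtime_length is a hypothetical name used only to show the same call on a pointer that is not a constant:

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// The traits function itself is usable in constant expressions...
static_assert(std::char_traits<char>::length("literal") == 7);

// ...and so is the string_view constructor that relies on it.
static_assert(std::string_view("literal").size() == 7);

// Hypothetical helper: for a pointer that is not a compile-time constant,
// the same call is an ordinary linear scan at runtime.
inline std::size_t runtime_length(const char* s) {
    return std::char_traits<char>::length(s);
}
```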

Upvotes: 3

Jarod42

Reputation: 218098

Unfortunately, std::string_view has no constructor taking const char (&)[N]. For character data it only offers (const char*) and (const char*, std::size_t size). The former has to compute the length; the latter is given it.

Instead of a macro, you could use a function template:

// Takes a reference to a char array, so N (including the terminator) is deduced.
template <std::size_t N>
constexpr std::string_view make_string_view(const char (&s)[N]) { return {s, N - 1}; }

operator ""sv might be even simpler (the size is also known at compile time and is used to construct the string_view). It requires a using-directive such as using namespace std::literals:

  • "hello world"sv

Note that you can include an explicit \0 as part of the view if you want:

  • "hello world\0"sv
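The differences between the three approaches can be sketched as follows, assuming C++17. The function template repeats the idea from this answer (written here with a const char (&)[N] parameter and constexpr), and the constant names are made up for illustration:

```cpp
#include <cstddef>
#include <string_view>

using namespace std::string_view_literals;  // enables the ""sv suffix

// Array-reference helper: N is deduced and includes the terminator.
template <std::size_t N>
constexpr std::string_view make_string_view(const char (&s)[N]) {
    return {s, N - 1};
}

constexpr auto kPlain    = "hello world"sv;    // 11 characters
constexpr auto kWithNull = "hello world\0"sv;  // the explicit \0 is in the view
static_assert(kPlain.size() == 11);
static_assert(kWithNull.size() == 12);

// An embedded \0: the const char* constructor stops at it, while the
// literal operator and the array helper both keep the whole literal.
static_assert(std::string_view("test\0test").size() == 4);
static_assert("test\0test"sv.size() == 9);
static_assert(make_string_view("test\0test").size() == 9);
```

Both ""sv and the helper receive the size from the compiler, so neither scans the string.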

Upvotes: 4
