string not optimized enough for string literals

In a C++ project of mine I'm about to replace every char* with std::string, but I have found one case where std::string performs badly.

Imagine I have these 2 functions:

void foo1(const std::string& s)
{
    ...
}

void foo2(const char* s)
{
    ...
}

If I write something like this:

const char* SL = "Hello to all!";

foo1(SL); // calls malloc, memcpy, free
foo2(SL);

in foo1, SL will be implicitly converted into a temporary std::string. This means that the std::string constructor will allocate memory and copy the string literal into that buffer. In foo2, none of this happens.
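As a rough illustration of the copy described above, here is a minimal sketch. The helper sharesStorage is a hypothetical name, not part of the original code; it takes the same parameter type as foo1 and reports whether the string it received points at the same characters as the original pointer.

```cpp
#include <string>

// Hypothetical probe with the same parameter type as foo1.
// Returns true only if the std::string's character data lives at the
// same address as `original`, i.e. no copy was made.
bool sharesStorage(const std::string& s, const char* original)
{
    return s.data() == original;
}
```

Calling sharesStorage(SL, SL) with the const char* from the question should return false, because the temporary std::string holds its own copy of the characters. Many implementations keep short strings in a buffer inside the object instead of calling malloc, but the characters are still copied.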

In most implementations std::string is supposed to be highly optimized (copy-on-write, for instance), but when I construct it from a const char* it is not. My question is this: why does this happen? Am I missing something? Is my standard library not optimized enough, or is this optimization unsafe for some reason that I'm not aware of?

Upvotes: 13

Views: 1730

Answers (3)

Matthieu M.

Reputation: 299850

Actually, your worries would go away(*) if you changed the literal:

std::string const SL = "Hello to all!";

I added the const for you.

Now, calling foo1 will not involve any copying at all, and calling foo2 costs very little:

foo1(SL);         // by const-reference, same cost as passing a pointer
foo2(SL.c_str()); // simple pointer

If you want to move to std::string, don't just switch the function interfaces; switch the variables (and constants) too.

(*) The original answer assumed that SL was a global constant. If it is a variable local to a function, it can be made static to avoid building it on every call.
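A minimal sketch of the static-local variant mentioned in the footnote. The accessor greeting and the body of foo1 are assumptions made for illustration; the original leaves foo1's body elided.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the question's foo1; it returns the length
// only so that the call has an observable result.
std::size_t foo1(const std::string& s)
{
    return s.size();
}

// The std::string is built once, the first time this function runs,
// and the same object is handed out on every later call.
const std::string& greeting()
{
    static const std::string SL = "Hello to all!";
    return SL;
}
```

With this, foo1(greeting()) binds the reference directly to the existing object, so no temporary std::string is constructed per call.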

Upvotes: 21

Stack Overflow is garbage

Reputation: 247969

std::string isn't a silver bullet. It's intended to be the best possible implementation of a general-purpose mutable string which owns its memory, and which is reasonably cheap to use with C APIs. Those are common scenarios, but they don't match every instance of string usage.

String literals, as you mention, do not fit this use case well. They use statically allocated memory, so std::string can't and shouldn't try to take ownership of the memory. And these strings are always read-only, so std::string can't allow you to modify them.

std::string creates a copy of the string data passed to it, and then works on this copy internally.

If you want to operate on constant strings whose lifetime is handled elsewhere (in the case of string literals, it's handled by the runtime library which initializes and frees static data), then you might want to use a different string representation. Perhaps just a simple const char*.
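If C++17 is available, std::string_view is one such representation: a non-owning pointer and length that converts implicitly from both const char* and std::string. The sketch below is only an illustration; foo3 and refersTo are hypothetical names that do not appear in the question.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical variant of foo1 that takes a non-owning view.
// Constructing the view from a const char* computes the length
// but does not allocate or copy the characters.
std::size_t foo3(std::string_view s)
{
    return s.size();
}

// Returns true when the view points at the same characters as `p`,
// which shows that the data was not copied.
bool refersTo(std::string_view s, const char* p)
{
    return s.data() == p;
}
```

The trade-off is the one described above: the view does not own the characters, so the caller has to make sure they outlive it. That is always the case for string literals.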

Upvotes: 5

Karel Petranek

Reputation: 15154

The problem is that std::string has no way to tell whether a const char* points to a string literal or to something else:

const char *a = "Hello World";
const char *b = new char[20];

The memory behind a char* might become invalid at any time (for example, when it points to a local array and the function or scope ends), so std::string must become the exclusive owner of the string data. This can only be achieved by copying.

The following example demonstrates why it is necessary:

#include <cstring>
#include <string>

std::string getHelloWorld() {
  char *hello = new char[64];
  std::strcpy(hello, "Hello World");
  std::string result = hello;  // copies the characters; without the copy, result would refer to freed memory
  delete[] hello;              // safe: result owns its own copy
  return result;
}

Upvotes: 10
