Reputation: 780

Why does std::string not have an (explicit) const char* cast?

I'd like to know the pros and cons of having and not having such a cast. In several places, including here on Stack Overflow, I have seen the const char* cast described as a bad idea, but I am not sure why.

The lack of a (const char*) cast, and being forced to always call c_str(), creates some problems when writing generic routines and templates.

#include <cstring>
#include <string>

void CheckStr(const char* s)
{
}

int main()
{
    std::string s = "Hello World!";

    // all below will not compile with
    // Error: No suitable conversion function from "std::string" to "const char *" exists!
    // CheckStr(s);
    // CheckStr((const char*)s);
    // strlen(s);

    // the only way that works
    CheckStr(s.c_str());
    std::size_t n = std::strlen(s.c_str());
    return 0;
}

For example, if I have a large number of text-processing functions that accept const char* as input and I want to pass a std::string, I have to call c_str() every time. As a result, a template function can't be used for both std::string and const char* without additional effort.

The drawback I can see is some operator-overloading issues, but these can be solved.

For example, as [eerorika] pointed out, allowing an implicit cast to a pointer inadvertently lets the string class take part in boolean expressions. But we can easily solve this by deleting the bool operator. Furthermore, the cast operator can be made explicit:

#include <iostream>
#include <string>

void CheckStr(const char* s) {}  // same function as in the first snippet

class String
{
public:
    String() {}
    String(const char* s) { m_str = s; }
    const char* str() const  { return m_str.c_str(); }
    char* str()  { return &m_str[0]; }
    char operator[](int pos) const { return m_str[pos]; }
    char& operator[](int pos) { return m_str[pos]; }
    explicit operator const char*() const { return str(); }  // cast operator
    operator bool() const = delete;

protected:
    std::string m_str;
};

int main()
{
    String s = "Hello";
    std::string s2 = "Hello";
    // if (s)  // will not compile: operator bool is a deleted function
    // {
    //     std::cout << "Bool is allowed " << std::endl;
    // }

    CheckStr((const char*)s);
    return 0;
}

Upvotes: 1

Answers (2)

AnT stands with Russia

Reputation: 320531

If by "having a cast" you mean a user-defined conversion operator, then the reason it does not have one is to prevent you from using it implicitly, possibly inadvertently.

Historically, the unpleasant consequences of inadvertent use of such a conversion stem from the fact that in the original std::string (per the C++98 specification) the operation was heavy and dangerous.

The original std::string was not trivially convertible to const char *, since the string object was not originally intended or required to store a null-terminator character. Under those circumstances, conversion to const char * was a potentially heavy operation that generally allocated an independent buffer and copied the entire controlled sequence to that buffer.

The independent buffer mentioned above (if used) had a potentially "unexpected" lifetime. Any modifying operation on the original std::string object triggered invalidation/deallocation of that buffer, rendering previously returned pointers invalid.

It is never a good idea to implement such heavy and dangerous operations as implicitly-invokable conversion operators.

The original C++ standard (C++98) did not have explicit conversion operators. (They first appeared in C++11.) A dedicated named member function was the only way to make the conversion explicit in C++98.

Today, in modern C++, we can define a conversion operator and still prevent it from being used implicitly (by using the explicit keyword). One can argue that under such circumstances implementing the conversion as an operator is a reasonable approach. But I'd still argue that it is not a good idea. Even though the modern std::string is required to store its null terminator (i.e. c_str() no longer produces an independent buffer), the pointer returned by the conversion to const char * is still "dangerous": many modifying operations applied to a std::string object may (and will) invalidate this pointer. To emphasize that this is not a safe and innocent conversion, but rather an operation that produces a potentially dangerous pointer, it is quite reasonable to implement it as a named function.
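To make the invalidation point concrete, here is a minimal sketch. The function names are made up for illustration, and it assumes a typical implementation in which a 5-character string has a capacity well below 1005 characters, so the append has to reallocate. The old and new buffer addresses are compared as integers, so the stale pointer is never dereferenced.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: returns true if the character buffer moved
// after the string grew past its capacity.
bool buffer_moved_after_growth()
{
    std::string s = "Hello";
    const auto before = reinterpret_cast<std::uintptr_t>(s.c_str());

    // Growing far past the current capacity forces a new allocation.
    // Any pointer obtained from c_str() before this line is now invalid.
    s.append(1000, '!');

    const auto after = reinterpret_cast<std::uintptr_t>(s.c_str());
    return before != after;
}

// Usage: call this from main().
void demo_invalidation()
{
    assert(buffer_moved_after_growth());
}
```

A pointer kept from before the append would dangle here, whether it came from c_str() or from a conversion operator. The named function at least makes the call site easy to spot.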

Upvotes: 4

eerorika

Reputation: 238351

I'd like to know the pros and cons of having and not having such a cast.

Con: Implicit conversions often have behaviour that is surprising to the programmer.

For example, what would you expect from the following program?

std::string some_string = "";
if (some_string)
    std::cout << "true";
else
    std::cout << "false";

Should the program be ill-formed because std::string has no conversion to bool? Should the result depend on the content of the string? Would most programmers have the same expectation?

With the current std::string, the above would be ill-formed because there is no such conversion. This is good. Whatever the programmer expected, they'll find out their misunderstanding when they attempt to compile.

If std::string had a conversion to a pointer, then there would also be a conversion sequence to bool through the conversion to pointer. The above program would be well-formed. And the program would print true regardless of the content of the string, since c_str() is never null. What if the programmer instead expected that an empty string would be false? What if they never intended either behaviour, but used a string there by accident?
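This can be tried with a small stand-in class. ImplicitString and is_truthy below are hypothetical names, not anything from the standard library. The class is only a sketch that adds the implicit conversion std::string deliberately lacks.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper: a string with an implicit conversion to const char*.
class ImplicitString
{
public:
    ImplicitString(const char* s) : m_str(s) {}
    operator const char*() const { return m_str.c_str(); }  // implicit on purpose

private:
    std::string m_str;
};

// Compiles only because of the pointer conversion: the pointer is then
// converted to bool, and c_str() never returns a null pointer.
bool is_truthy(const ImplicitString& s)
{
    if (s)
        return true;
    return false;
}

// Usage: call this from main().
void demo_truthiness()
{
    assert(is_truthy(ImplicitString("text")));
    assert(is_truthy(ImplicitString("")));  // an empty string is "true" as well
}
```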

What about the following program?

std::string some_string = "";
std::cout << some_string + 42;

Would you expect the program to be ill-formed because there is no such operator for string and int?

If there were an implicit conversion to char*, the above would have undefined behaviour because it does pointer arithmetic and accesses the string buffer outside of its bounds.
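The same kind of stand-in class shows what the compiler would make of such an expression. ImplicitString and skip_first are hypothetical names again. The sketch uses an offset of 1, which stays inside the buffer, so that running it is well defined. An offset of 42 would compile just as quietly.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical wrapper: a string with an implicit conversion to const char*.
class ImplicitString
{
public:
    ImplicitString(const char* s) : m_str(s) {}
    operator const char*() const { return m_str.c_str(); }  // implicit on purpose

private:
    std::string m_str;
};

// "string + int" is accepted and has type const char*:
// it is neither a compile error nor a concatenation.
static_assert(
    std::is_same<decltype(std::declval<const ImplicitString&>() + 1),
                 const char*>::value,
    "operator+ resolves to built-in pointer arithmetic");

// Pointer arithmetic in disguise: returns a pointer one character into the buffer.
const char* skip_first(const ImplicitString& s)
{
    return s + 1;
}

// Usage: call this from main().
void demo_pointer_arithmetic()
{
    ImplicitString s("Hello");
    assert(std::strcmp(skip_first(s), "ello") == 0);
}
```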

// all below will not compile with 
strlen(s);

This is actually a good thing. Most of the time, you don't want to call strlen(s). Usually, you should use s.size() because it is asymptotically faster: it runs in constant time, whereas strlen has to scan the whole string. The need for strlen(s.c_str()) is so rare that the little bit of verbosity is insignificant.
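Speed is not the only difference. The two calls can also disagree when the string contains an embedded null character. A minimal sketch, with helper names chosen only for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Length as the string object records it: constant time, counts every character.
std::size_t stored_length(const std::string& s)
{
    return s.size();
}

// Length as a C API sees it: a linear scan that stops at the first '\0'.
std::size_t c_length(const std::string& s)
{
    return std::strlen(s.c_str());
}

// Usage: call this from main().
void demo_lengths()
{
    const std::string s("ab\0cd", 5);  // five characters, one of them '\0'
    assert(stored_length(s) == 5);
    assert(c_length(s) == 2);
}
```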

Forcing the use of .c_str() is great because it shows the reader of the program that it is not a std::string that is passed to the function / operator, but a char*. With implicit conversion, it is not possible to distinguish one from the other.

... creates some problems when writing generic routines and templates.

Such problems are not insurmountable.
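One possible way around it, as a sketch rather than the definitive solution: put the c_str() call behind a pair of small overloads, so that a template can accept either type. The names as_c_str and count_spaces are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers: one overload per supported string type.
inline const char* as_c_str(const char* s) { return s; }
inline const char* as_c_str(const std::string& s) { return s.c_str(); }

// Generic routine written once against const char*.
// Works for const char*, string literals and std::string.
template <typename Str>
std::size_t count_spaces(const Str& text)
{
    std::size_t n = 0;
    for (const char* p = as_c_str(text); *p != '\0'; ++p)
        if (*p == ' ')
            ++n;
    return n;
}

// Usage: call this from main().
void demo_generic()
{
    const char* raw = "x y";
    assert(count_spaces(raw) == 1);
    assert(count_spaces("a b c") == 2);
    assert(count_spaces(std::string("Hello World!")) == 1);
}
```

The conversion stays visible inside the helper, and the call sites no longer need to know which kind of string they were given.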

Upvotes: 6
