Reputation: 780
I would like to know the pros and cons of having, and of not having, such a cast. In several places, including here on Stack Overflow, I see that a const char* cast is considered a bad idea, but I am not sure why.
The lack of a (const char*) cast, and being forced to always call c_str(), creates some problems when writing generic routines and templates.
#include <cstring>
#include <string>

void CheckStr(const char* s)
{
}

int main()
{
    std::string s = "Hello World!";
    // none of the calls below compile:
    // Error: No suitable conversion function from "std::string" to "const char *" exists!
    //CheckStr(s);
    //CheckStr((const char*)s);
    //strlen(s);
    // the only way that works
    CheckStr(s.c_str());
    size_t n = strlen(s.c_str());
    return 0;
}
For example, if I have a large number of text processing functions that accept const char* as input and I want to be able to use std::string, I have to call c_str() each time. But this way a template function can't be used for both std::string and const char* without additional effort.
I can see that this would cause some operator overloading issues, but these can be solved.
For example, as [eerorika] pointed out, allowing an implicit cast to a pointer inadvertently allows the string class to take part in boolean expressions. But we can easily solve this by deleting the bool operator. Going further, the cast operator can be made explicit:
#include <iostream>
#include <string>

class String
{
public:
    String() {}
    String(const char* s) { m_str = s; }
    const char* str() const { return m_str.c_str(); }
    char* str() { return &m_str[0]; }
    char operator[](int pos) const { return m_str[pos]; }
    char& operator[](int pos) { return m_str[pos]; }
    explicit operator const char*() const { return str(); } // cast operator
    operator bool() const = delete;
protected:
    std::string m_str;
};

int main()
{
    String s = "Hello";
    std::string s2 = "Hello";
    if(s) // will not compile: it is a deleted function
    {
        std::cout << "Bool is allowed " << std::endl;
    }
    CheckStr((const char*)s); // CheckStr as defined in the first example
    return 0;
}
Upvotes: 1
Views: 881
Reputation: 320531
If by "having a cast" you mean a user defined conversion operator, then the reason it does not have it is: to prevent you from using it implicitly, possibly inadvertently.
Historically, the unpleasant consequences of inadvertently using such a conversion stem from the fact that in the original std::string (per the C++98 specification) the operation was heavy and dangerous.
The original std::string was not trivially convertible to const char *, since the string object was not originally intended/required to store a null-terminator character. Under those circumstances, conversion to const char * was a potentially heavy operation that generally allocated an independent buffer and copied the entire controlled sequence to that buffer.
The independent buffer mentioned above (if used) had a potentially "unexpected" lifetime. Any modifying operation on the original std::string object triggered invalidation/deallocation of that buffer, rendering previously returned pointers invalid.
It is never a good idea to implement such heavy and dangerous operations as implicitly-invokable conversion operators.
The original C++ standard (C++98) had no such feature as explicit conversion operators. (They first appeared in C++11.) A dedicated named member function was the only way to make the conversion explicit in C++98.
Today, in modern C++, we can define a conversion operator and still prevent it from being used implicitly (by using the explicit keyword). One can argue that under such circumstances implementing the conversion as an operator is a reasonable approach. But I'd still argue that it is not a good idea. Even though the modern std::string is required to store its null-terminator (i.e. c_str() no longer produces an independent buffer), the pointer returned by the conversion to const char * is still "dangerous": many modification operations applied to the std::string object may (and will) invalidate this pointer. To emphasize the fact that this is not a mere safe and innocent conversion, but rather an operation that produces a potentially dangerous pointer, it is quite reasonable to implement it as a named function.
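The invalidation described above can be observed without ever touching a stale pointer. The following is a minimal sketch, not taken from the answer: the helper name buffer_moved_after_growth is hypothetical, and it assumes only that appending past the current capacity forces std::string to move its contents to a new buffer. The address is stored as an integer so the old pointer is never used after it becomes invalid.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the original post): reports whether the
// buffer behind c_str() moved after the string grew past its capacity.
bool buffer_moved_after_growth()
{
    std::string s = "Hello";

    // Record the address as an integer so the stale pointer is never used.
    const auto before = reinterpret_cast<std::uintptr_t>(s.c_str());
    const std::size_t old_capacity = s.capacity();

    // Appending more characters than the current capacity can hold
    // forces the string to obtain a larger buffer.
    s.append(old_capacity + 1, '!');

    const auto after = reinterpret_cast<std::uintptr_t>(s.c_str());
    return before != after;
}
```

Any const char * obtained before the append, whether through c_str() or through a conversion operator, would now point at storage the string no longer uses.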
Upvotes: 4
Reputation: 238351
I like to know pro's and con's for having and not-having such cast.
Con: Implicit conversions often have behaviour that is surprising to the programmer.
For example, what would you expect from the following program?
std::string some_string = "";
if (some_string)
    std::cout << "true";
else
    std::cout << "false";
Should the program be ill-formed because std::string has no conversion to bool? Should the result depend on the content of the string? Would most programmers have the same expectation?
With the current std::string, the above would be ill-formed because there is no such conversion. This is good. Whatever the programmer expected, they'll find out their misunderstanding when they attempt to compile.
If std::string had a conversion to a pointer, then there would also be a conversion sequence to bool through the conversion to pointer. The above program would be well-formed. And the program would print true regardless of the content of the string, since c_str is never null. What if the programmer instead expected that an empty string would be false? What if they never intended either behaviour, but used a string there by accident?
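To make the conversion chain concrete, here is a small sketch using a stand-in type. ImplicitString and is_truthy are hypothetical names introduced only for illustration; std::string itself has no such operator.

```cpp
#include <string>

// Hypothetical wrapper that adds the implicit conversion under discussion.
struct ImplicitString
{
    std::string data;
    operator const char*() const { return data.c_str(); }
};

// Compiles because of the chain ImplicitString -> const char* -> bool.
// The pointer is never null, so the condition holds for any content.
bool is_truthy(const ImplicitString& s)
{
    if (s)
    {
        return true;
    }
    return false;
}
```

With this type, is_truthy(ImplicitString{""}) returns true even though the string is empty, which is exactly the surprise described above.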
What about the following program?
std::string some_string = "";
std::cout << some_string + 42;
Would you expect the program to be ill-formed because there is no such operator for string and int?
If there were an implicit conversion to char*, the above would have undefined behaviour because it does pointer arithmetic and accesses the string buffer outside of its bounds.
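The sketch below checks at compile time that such an expression would be accepted, without evaluating it. ImplicitString and SumType are hypothetical names for illustration; decltype only inspects the type of the expression, so no out-of-bounds arithmetic actually happens.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical wrapper that adds the implicit conversion under discussion.
struct ImplicitString
{
    std::string data;
    operator const char*() const { return data.c_str(); }
};

// decltype does not evaluate its operand; it only reports the type
// the compiler would give to "string + 42".
using SumType = decltype(std::declval<const ImplicitString&>() + 42);

// The addition resolves to built-in pointer arithmetic on const char*.
static_assert(std::is_same<SumType, const char*>::value,
              "string + int silently becomes pointer arithmetic");
```

Nothing in the expression hints that a pointer is involved, so a typo or a misplaced operand would compile cleanly and fail only at run time, if at all.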
// all below will not compile with strlen(s);
This is actually a good thing. Most of the time, you don't want to call strlen(s). Usually, you should use s.size() because it is asymptotically faster. The need for strlen(s.c_str()) is so rare that the little bit of verbosity is insignificant.
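As a rough illustration of the two ways to get a length, consider the sketch below. The function names are hypothetical. Besides the cost difference (size() is constant time, while strlen scans for the terminator), the two can disagree when the string contains an embedded null character.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Constant time: the string already knows its length.
std::size_t length_via_size(const std::string& s)
{
    return s.size();
}

// Linear time: scans the buffer until the first '\0'.
std::size_t length_via_strlen(const std::string& s)
{
    return std::strlen(s.c_str());
}
```

For "Hello World!" both return 12. For std::string("ab\0cd", 5), length_via_size returns 5 while length_via_strlen stops at the embedded null and returns 2.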
Forcing the use of .c_str() is great because it shows the reader of the program that it is not a std::string that is passed to the function / operator, but a char*. With implicit conversion, it is not possible to distinguish one from the other.
... creates some problems when writing generic routines and templates.
Such problems are not insurmountable.
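One possible way around them, assuming C++17 is available, is to take a std::string_view parameter, which both std::string and const char* convert to implicitly. This is a sketch of that approach rather than something the answer itself proposes, and count_spaces is a hypothetical routine chosen only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical text-processing routine: counts the spaces in its input.
// std::string_view is a non-owning view, so no copy of the text is made.
std::size_t count_spaces(std::string_view text)
{
    std::size_t count = 0;
    for (char c : text)
    {
        if (c == ' ')
        {
            ++count;
        }
    }
    return count;
}
```

Both count_spaces("a b c") and count_spaces(std::string("Hello World!")) compile without any c_str() call. The view shares the lifetime caveats of a raw pointer: it must not outlive or survive modification of the string it refers to.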
Upvotes: 6