user6331143

Reputation: 1

Does implementation-definedness of char affect std::string?

I thought all types were signed unless otherwise specified (like int). I was surprised to find that for char it's actually implementation-defined:

... It is implementation-defined whether a char object can hold negative values. ... In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.

However std::string is really just std::basic_string<char, ...>.

Can the semantics of this program change from one implementation to another?

#include <string>

int main()
{
    char c = -1;
    std::string s{1, c};
}

Upvotes: 0

Views: 61

Answers (1)

Keith Thompson

Reputation: 263577

Yes and no.

Since a std::string contains objects of type char, the signedness of type char can affect its behavior.

The program in your question:

#include <string>

int main()
{
    char c = -1;
    std::string s{1, c};
}

has no visible behavior (unless terminating without producing any output is "behavior"), so its behavior doesn't depend on the signedness of plain char. A compiler could reasonably optimize out the entire body of main. (I'm admittedly nitpicking here, commenting on the code example you picked rather than the question you're asking.)

But this program:

#include <iostream>
#include <string>

int main() {
    std::string s = "xx";
    s[0] = -1;
    s[1] = +1;
    std::cout << "Plain char is " << (s[0] < s[1] ? "signed" : "unsigned") << "\n";
}

will correctly print either "Plain char is signed" or "Plain char is unsigned", depending on the implementation.
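If you only need to know the answer rather than observe it through a std::string, the same question can be asked directly. The following is a minimal sketch, not a recommendation: the function names are made up for illustration, while std::numeric_limits<char>::is_signed and std::is_signed<char> are the standard library's own queries.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Compile-time query: true when plain char is a signed type on this implementation.
constexpr bool plain_char_is_signed() {
    return std::numeric_limits<char>::is_signed;
}

// Runtime check mirroring the program above: -1 stored in a char compares
// less than 1 only when plain char is signed.
bool minus_one_is_less_than_one() {
    char minus_one = static_cast<char>(-1);  // the largest char value if char is unsigned
    char plus_one = 1;
    return minus_one < plus_one;
}

// Hypothetical helper: both approaches should agree on any implementation.
void check_signedness_queries_agree() {
    static_assert(std::is_signed<char>::value == plain_char_is_signed(),
                  "the two standard queries should agree");
    assert(minus_one_is_less_than_one() == plain_char_is_signed());
}
```

Calling check_signedness_queries_agree() from main should pass silently whichever way the implementation defines plain char.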

Note that a similar program that compares two std::string objects using that type's operator< cannot tell whether plain char is signed or unsigned: since C++11, std::char_traits<char> compares characters as if they were unsigned char, much as C's memcmp does.
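To illustrate the point about operator<, here is a small sketch. It assumes a C++11 or later implementation; the helper names string_less, unsigned_less and check_string_ordering are invented for this example.

```cpp
#include <cassert>
#include <string>

// Compare two one-character strings with std::string's operator<,
// which goes through std::char_traits<char>.
bool string_less(char a, char b) {
    return std::string(1, a) < std::string(1, b);
}

// The same ordering written out explicitly for single characters.
bool unsigned_less(char a, char b) {
    return static_cast<unsigned char>(a) < static_cast<unsigned char>(b);
}

// Hypothetical helper documenting the expected results.
void check_string_ordering() {
    const char high = static_cast<char>(-1);  // UCHAR_MAX when viewed as unsigned char
    const char low = 1;
    assert(!string_less(high, low));  // "high" sorts after "low" ...
    assert(string_less(low, high));   // ... whether or not plain char is signed
    assert(string_less(high, low) == unsigned_less(high, low));
}
```

Unlike the s[0] < s[1] test above, these assertions should hold on both kinds of implementation.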

But this shouldn't matter 99% of the time. You almost certainly have to go out of your way to write code whose behavior depends on the signedness of char. You should keep in mind that it's implementation-defined, but if the signedness matters, you should be using signed char or (more likely) unsigned char explicitly. char is a numeric type, but you should use it only to hold character data.
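As one example of being explicit when the numeric value matters, the sketch below sums the bytes of a string after converting each char to unsigned char. The function byte_sum is hypothetical and only meant to show the cast; it is not taken from any library.

```cpp
#include <cassert>
#include <string>

// Sum the bytes of a string as non-negative values (0..UCHAR_MAX),
// so the result does not depend on whether plain char is signed.
unsigned long byte_sum(const std::string& s) {
    unsigned long total = 0;
    for (char c : s) {
        // Without this cast, a negative char would be converted to a huge
        // unsigned long value on implementations where plain char is signed.
        total += static_cast<unsigned char>(c);
    }
    return total;
}
```

With 8-bit bytes, a string holding two characters that were each assigned -1 should sum to 510 on every implementation.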

Upvotes: 2
