Reputation: 851
I'm looking to eliminate special case in algorithm, and in order to do so I'd like to be sure that std::string::npos + 1
is 0
.
This is true in all the implementations I have tested, and searching in the standard I have found the following:
namespace std {
template<class charT, class traits = char_traits<charT>,
class Allocator = allocator<charT> >
class basic_string {
// ...
typedef typename allocator_traits<Allocator>::size_type size_type;
// ...
static const size_type npos = -1;
...
};
}
allocator_traits<Allocator>::size_type
is defined as an unsigned integer type.
So, I suppose, my question boils down to whether (static_cast<T>(-1)) + 1
always gives 0
for unsigned integer type T?
Upvotes: 7
Views: 969
Reputation: 2984
I'd say, that from C++20 on, this is practically guaranteed. First of all, before C++20, the bit layout of signed integers was unspecified. An implementation COULD have represented negative integers with a negative sign bit on top and then the absolute value behind. In that format, the integer -1 would have had the bit representation 1000...0001
(all zero bits, with a single one bit both at the front and at the end). If that is casted to unsigned
and then 1 is added to it, it would become 1000...0010
, which of course is not zero.
C++20 introduced the rule, that signed integers must use two's complement, so the bit representation of -1 is now guaranteed to be 111...111
(every bit is one). Adding 1 to such a value will flip all bits to zero (and set the "carry bit" in the processor's flag register, but that extra bit is ignored by C++).
However, what I just said, is only true, if the unsigned type behind std::basic_string::npos
, which is std::basic_string::size_type
, has at least as many bits as int
. Because otherwise, the value of npos
would be padded with zero bits in front before performing the addition and the result would be "000...0001000....000", with the second group of zeros as long as the original size of npos
before the padding occurred. However, for all practical C++ implementations out there, the size of pointer difference types like std::basic_string::size_type
is at least at that of int
. And making int
longer than std::size_t
would break so much code, that nobody is ever going to do that.
Furthermore, in over 99% of the cases, you are going to feed npos + 1
into one of the string functions, that take a std::basic_string::size_type
as argument. Even, if npos + 1
got sign extended, it will be truncated to a std::basic_string::size_type
and thus to zero in such usage.
For example, use this code to cut out the word after the last space in a string:
std::string words = "no-space";
std::string lastword = words.substr(words.rfind(' ') + 1);
If you want to completely eliminate the 0.000...1% risk, that the effectively mentioned in the beginning doesn't hold in your edge case, just test for it:
static_assert (std::string::npos + 1 == 0, "unexpected behavior of std::string::npos")
Upvotes: 0
Reputation:
No, adding one to the largest value of an unsigned integer type is not guaranteed to give you zero.
If size_type
is defined as unsigned short
, and int
is wider than unsigned short
, the LHS of your addition will be converted to the RHS, and you rely on the addition being performed in the LHS's type.
Worse yet, but this is far less likely to actually happen, if unsigned short
has exactly the same number of value bits as int
, the addition would overflow and result in undefined behaviour.
What you can do, however, is add 1U
, and convert the result of the addition back to T
. That should generate exactly the same code on common implementations, but would be guaranteed to be valid for all.
Upvotes: 5