Reputation: 20746
I have a std::string object that I need to pass to a C function that works on a char* buffer by iterating over it until it finds the null terminator.
So I have something like this:
// C function
void foo(char* buf);
// C++ code
std::string str("str");
foo(&str[0]);
Suppose that we use C++11, so we have a guarantee that std::string
representation will have contiguously stored characters.
But is there any guarantee that &str[0] points to a buffer that ends with \0? Yes, there is the c_str member function, but I'm asking about operator[].
Can somebody quote the standard, please?
Upvotes: 5
Views: 627
Reputation: 5209
According to the standard, yes. The underlying char array is accessible through string::data or string::c_str, about which the standard says:
21.4.7.1 basic_string accessors [string.accessors]
const charT* c_str() const noexcept;
const charT* data() const noexcept;
1 Returns: A pointer p such that p + i == &operator[](i) for each i in [0,size()].
2 Complexity: Constant time.
3 Requires: The program shall not alter any of the values stored in the character array.
To show that it is null-terminated, look at the definition of operator[] (emphasis mine):
21.4.5 basic_string element access [string.access]
const_reference operator[](size_type pos) const;
reference operator[](size_type pos);
1 Requires: pos <= size().
2 Returns: *(begin() + pos) if pos < size(). Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior.
3 Throws: Nothing.
4 Complexity: Constant time.
Thus str[str.size()] returns charT(), and since std::string is std::basic_string<char>, charT() is '\0'.
Combined with p + i == &operator[](i) from the accessors quoted above, that means that, in your case, *(&str[0] + str.size()) == '\0' must, according to the standard, always be true.
Beware that modifying the element returned by str[str.size()] is UB.
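A minimal sketch of that check, assuming a hypothetical function c_style_length standing in for the C function foo (neither helper name below comes from the question or the standard):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the C function foo: walks the buffer until it
// meets the null terminator and returns how many characters came before it.
std::size_t c_style_length(char* buf) {
    std::size_t n = 0;
    while (buf[n] != '\0') {
        ++n;
    }
    return n;
}

// Checks the expression discussed above: the element size() positions past
// &str[0] compares equal to '\0'. It only reads that element, never writes it.
bool terminator_is_reachable(std::string& str) {
    return *(&str[0] + str.size()) == '\0';
}
```

With std::string str("str"), one would expect terminator_is_reachable(str) to hold and c_style_length(&str[0]) to equal str.size(), as long as the string contains no embedded null characters.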
Upvotes: 5
Reputation: 275510
In practice, yes. There are exactly zero standards-conforming implementations of std::string that do not store a NUL character at the end of the buffer.
So if you aren't wondering for wondering's sake, you are done.
However, if you are wondering about the standard being abstruse:
In C++14, yes. There is a clear requirement that [] return a contiguous set of elements, that [size()] return a NUL character, and that const methods not modify state. So *((&str[0])+size()) must be the same as str[size()], and str[size()] must be a NUL, thus game over.
In C++11, almost certainly. There are rules that const methods may not modify state. There are guarantees that data() and c_str() return a null-terminated buffer that agrees with [] at each point.
A convoluted reading of the C++11 standard would hold that, prior to any call of data() or c_str(), [size()] doesn't return the NUL terminator at the end of the buffer but rather a static const CharT that is stored separately, and the buffer has an uninitialized value (or even a trap value) where the NUL should be. Due to the requirement that const methods not modify state, I believe this reading is incorrect.
This reading requires &str[str.size()] to change between calls to .data(), which is an observable change in the state of the string across a const call, which I would read as being illegal.
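To make "observable change" concrete, here is a rough sketch of what a program could compare. The function name terminator_address_is_stable is made up for illustration, and the check only reports what a given implementation does; it does not settle what the standard requires:

```cpp
#include <cassert>
#include <string>

// Takes the address of the element at index size() before and after a call
// to the const member data(), and compares both with data() + size().
// Under the convoluted reading, "before" could differ from "after".
bool terminator_address_is_stable(const std::string& s) {
    const char* before = &s[s.size()];
    const char* data = s.data();
    const char* after = &s[s.size()];
    return before == after && after == data + s.size();
}
```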
An alternative way to get around the standard might be to not initialize str[str.size()] until you legally access it by calling .data(), .c_str(), or actually passing str.size() to operator[]. As the standard defines no ways to access that element other than those three, you could stretch things and say lazy initialization of the NUL is legal.
I'd question this, as the definition of .data() implies that the elements returned by [] are contiguous, so &[0] is the same address as .data(), and .data()+.size() is guaranteed to point to a NUL CharT, so (&[0])+.size() must as well; and with no non-const methods called, the state of the std::string may not change between the calls.
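That chain of reasoning can be sketched as a check that uses only const member functions. The name buffer_agrees_with_data is an assumption for illustration, not anything from the standard:

```cpp
#include <cassert>
#include <string>

// Returns true when &s[0] and s.data() are the same address and the element
// at data() + size() is the null character. Only const members are called,
// so the string's state should not change along the way.
bool buffer_agrees_with_data(const std::string& s) {
    const char* via_index = &s[0];
    const char* via_data = s.data();
    return via_index == via_data && via_data[s.size()] == '\0';
}
```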
But what if the compiler can look ahead and see that you'll never call .data() or .c_str()? Does the requirement of contiguity hold if it can be proven you never call them?
At which point I'd throw my hands up and shoot the hostile compiler.
The standard is very passively voiced about this. So there may be a way to make an arguably standards-conforming std::string that doesn't follow these rules. But because the guarantees get closer and closer to explicitly requiring the NUL terminator there, the odds of a new compiler showing up that uses a tortured reading of C++ to claim this is standards compliant are low.
Upvotes: 7