Reputation: 10539
int main()
{
char buffer[5] = { 0 };
buffer[0] = 23;
std::string s(&buffer[0], 4);
std::uint32_t nb = *reinterpret_cast<const std::uint32_t*>(s.data());
return 0;
}
For this program, is reinterpret_cast's output implementation dependent? Or will any compiler conforming to the c++ standard always produce the same output?
Upvotes: 0
Views: 695
Reputation: 145419
You're casting to std::uint32_t
a buffer that is not necessarily properly aligned for such a value.
That's likely to blow up and/or be hugely inefficient.
The unsigned integer type means that any bitpattern for the value representation bits is OK, and on the PC platform for built-in type there are no bits other than the value representation bits; in particular no trap bits or trapping total bitpatterns.
Thus, you can do a memcpy
and you'll be fine, technically – provided there are enough bytes, that s.length() >= sizeof(std::uint32_t)
.
However, such a conversion, if it occurred in ordinary code, would be a strong code-smell, an indication of something fundamentally wrong in the design.
Addendum, regarding “Or a compiler respectfull to the c++ standard will always produce the same output”.
I somehow didn’t see that when I answered. But the short answer is that if the conversion is performed in a way that works, such as using memcpy
, then it depends on the endianness, a.k.a. byte order, in practice whether the most significant or least significant part of an integer is placed at lowest address.
In practice you can use network-oriented functions that convert to from network byte order. Just assume network byte order for the serialized data. Check out ntohl
et al (these are not part of the C++ standard library, but commonly available).
Upvotes: 2
Reputation: 340366
For your example code, if you're looking for something that "any compiler conforming to the c++ standard always produce the same output", the answer is that there's no such guarantee.
A couple easy examples: alignment issues (as mentioned in several comments) and endianness differences.
C++11 5.2.10/7 "Reinterpret cast" says:
An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue
v
of type “pointer toT1
” is converted to the type “pointer to cvT2
”, the result isstatic_cast<cv T2*>(static_cast<cv void*>(v))
if bothT1
andT2
are standard-layout types (3.9) and the alignment requirements ofT2
are no stricter than those ofT1
, or if either type isvoid
. Converting a prvalue of type “pointer toT1
” to the type “pointer toT2
” (whereT1
andT2
are object types and where the alignment requirements ofT2
are no stricter than those ofT1
) and back to its original type yields the original pointer value. The result of any other such pointer conversion is unspecified.
Since uint32_t
will generally have a stricter alignment requirement than char[]
, the standard doesn't make any promises about the behavior (since the above only talks about the situation where the alignment requirements are met). So strictly speaking the behavior is undefined.
Now, lets assume that you're interested only in platforms where the alignment requirements are met (ie., uint32_t
can be aligned on any address, same as char
). Then your expression involving the reinterpret cast is equivalent to (note that you'd have to cast away the const
from the const char*
returned from std::string::data()
as well):
std::uint32_t nb = *(static_cast<std::uint32_t*>(static_cast<void*>(const_cast<char*>(s.data()))));
The standard says this about using static_cast
with object pointers (other than conversion between pointers in a class heirarchy) in 5.2.9/13 "Static cast":
A prvalue of type “pointer to cv1
void
” can be converted to a prvalue of type “pointer to cv2T
,” whereT
is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. The null pointer value is converted to the null pointer value of the destination type. A value of type pointer to object converted to “pointer to cvvoid
” and back, possibly with different cv-qualification, shall have its original value.
So, as far as the standard is concerned, all that you can do with the resulting pointer is cast it back to get the original value. Anything else would be undefined behavior (that an implementation might give a better guarantee on).
3.10/10 "Lvalues and rvalues" allows an object to be accessed through char
or unsigned char
types as well.
However, to reiterate: the standard does not guarantee that "any compiler conforming to the c++ standard always produce the same output" for the example you posted.
Upvotes: 2