Guillaume Paris

Reputation: 10539

guarantee of reinterpret_cast output for serialization purpose

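#include <cstdint>
#include <string>

int main()
{
    char buffer[5] = { 0 };
    buffer[0] = 23;

    std::string s(&buffer[0], 4);  // four bytes: 23, 0, 0, 0
    // Reinterpret the string's bytes as a 32-bit unsigned integer.
    std::uint32_t nb = *reinterpret_cast<const std::uint32_t*>(s.data());

    return 0;
}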

For this program, is reinterpret_cast's output implementation-dependent? Or will any compiler conforming to the C++ standard always produce the same output?

Upvotes: 0

Views: 695

Answers (2)

Cheers and hth. - Alf

Reputation: 145419

You're casting to std::uint32_t a buffer that is not necessarily properly aligned for such a value.

On a platform that enforces alignment, dereferencing the result is likely to blow up; on others it may just be hugely inefficient.

The unsigned integer type means that any bit pattern for the value representation bits is OK, and on the PC platform the built-in types have no bits other than the value representation bits; in particular, there are no trap bits and no trapping bit patterns.

Thus, technically, you can do a memcpy and you'll be fine, provided there are enough bytes, i.e. that s.length() >= sizeof(std::uint32_t).
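A minimal sketch of the memcpy approach, assuming a helper named read_u32_native (a hypothetical name, not something from the question). It checks the length first and reports failure instead of reading past the end:

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: copies the first sizeof(std::uint32_t) bytes of `s`
// into `out`. std::memcpy places no alignment requirement on its source,
// so the char buffer inside the string is never accessed as a uint32_t.
bool read_u32_native(const std::string& s, std::uint32_t& out)
{
    if (s.size() < sizeof out) {
        return false;  // not enough bytes to fill the integer
    }
    std::memcpy(&out, s.data(), sizeof out);
    return true;  // `out` now holds the bytes in the host's own byte order
}
```

The value that comes out still depends on the host's byte order, so this fixes the alignment problem only.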

However, such a conversion, if it occurred in ordinary code, would be a strong code-smell, an indication of something fundamentally wrong in the design.


Addendum, regarding “Or a compiler respectfull to the c++ standard will always produce the same output”.

I somehow didn’t see that when I answered. The short answer is that even if the conversion is performed in a way that works, such as using memcpy, the result depends on the endianness, a.k.a. byte order: in practice, whether the most significant or the least significant part of an integer is placed at the lowest address.

In practice you can use network-oriented functions that convert to and from network byte order. Just assume network byte order for the serialized data. Check out ntohl et al. (these are not part of the C++ standard library, but are commonly available).
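If you would rather not depend on a platform header for ntohl, the same effect can be had with plain shifts. The sketch below is a swapped-in alternative to ntohl/htonl, not those functions themselves, and the names encode_u32_be and decode_u32_be are made up for illustration. It fixes the serialized form as network (big-endian) byte order:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: writes `v` as four bytes, most significant byte first.
std::string encode_u32_be(std::uint32_t v)
{
    std::string out(4, '\0');
    out[0] = static_cast<char>((v >> 24) & 0xFFu);
    out[1] = static_cast<char>((v >> 16) & 0xFFu);
    out[2] = static_cast<char>((v >> 8) & 0xFFu);
    out[3] = static_cast<char>(v & 0xFFu);
    return out;
}

// Hypothetical helper: reads four big-endian bytes from the start of `s`.
// Returns false if `s` is too short. Works the same on any host byte order
// because it only does arithmetic on individual bytes.
bool decode_u32_be(const std::string& s, std::uint32_t& out)
{
    if (s.size() < 4) {
        return false;
    }
    out = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        out = (out << 8) | static_cast<unsigned char>(s[i]);
    }
    return true;
}
```

With this convention, four bytes holding 0, 0, 0, 23 decode to 23 on every platform, whereas a memcpy of the same bytes gives a result that depends on the host.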

Upvotes: 2

Michael Burr

Reputation: 340366

For your example code, if you're looking for something that "any compiler conforming to the c++ standard always produce the same output", the answer is that there's no such guarantee.

A couple of easy examples: alignment issues (as mentioned in several comments) and endianness differences.

C++11 5.2.10/7 "Reinterpret cast" says:

An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of type “pointer to T1” is converted to the type “pointer to cv T2”, the result is static_cast<cv T2*>(static_cast<cv void*>(v)) if both T1 and T2 are standard-layout types (3.9) and the alignment requirements of T2 are no stricter than those of T1, or if either type is void. Converting a prvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value. The result of any other such pointer conversion is unspecified.

Since uint32_t will generally have a stricter alignment requirement than char[], the standard makes no promises here: the paragraph above only covers the case where the alignment requirements are met, and the result of any other such conversion is unspecified. So, strictly speaking, dereferencing the resulting pointer is undefined behavior.
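To see the alignment mismatch on a given platform, a check along these lines can help. This is only a sketch: is_aligned_for_u32 is a hypothetical helper, and it assumes the common case where converting a pointer to std::uintptr_t yields the numeric address (that mapping is implementation-defined).

```cpp
#include <cstdint>

// Hypothetical helper: reports whether `p` is a multiple of
// alignof(std::uint32_t), assuming a pointer converts to its numeric address.
bool is_aligned_for_u32(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(std::uint32_t) == 0;
}
```

Calling is_aligned_for_u32(s.data()) on the string from the question may well return true in practice, but nothing in the standard requires it, and a passing check does not by itself make the dereference well-defined.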

Now, let's assume that you're interested only in platforms where the alignment requirements are met (i.e., uint32_t can be aligned on any address, same as char). Then your expression involving the reinterpret cast is equivalent to (note that you'd have to cast away the const from the const char* returned from std::string::data() as well):

std::uint32_t nb = *(static_cast<std::uint32_t*>(static_cast<void*>(const_cast<char*>(s.data()))));

The standard says this about using static_cast with object pointers (other than conversion between pointers in a class hierarchy) in 5.2.9/13 "Static cast":

A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T,” where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. The null pointer value is converted to the null pointer value of the destination type. A value of type pointer to object converted to “pointer to cv void” and back, possibly with different cv-qualification, shall have its original value.

So, as far as the standard is concerned, all that you can do with the resulting pointer is cast it back to get the original value. Anything else would be undefined behavior (that an implementation might give a better guarantee on).

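3.10/10 "Lvalues and rvalues" additionally allows any object to be accessed through an lvalue of char or unsigned char type. That permission runs one way only: it does not allow a char array to be read through a std::uint32_t lvalue.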
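The direction that 3.10/10 does permit, looking at an integer's bytes through unsigned char, can be sketched as follows. The helpers byte_at and host_is_little_endian are hypothetical names used only for illustration:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns the byte at position `index` in the object
// representation of `value`. Reading through unsigned char is permitted for
// any object. The caller must keep index < sizeof(std::uint32_t).
unsigned char byte_at(const std::uint32_t& value, std::size_t index)
{
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    return bytes[index];
}

// Hypothetical helper: true when the least significant byte is stored at the
// lowest address, i.e. the host is little-endian.
bool host_is_little_endian()
{
    const std::uint32_t probe = 1;
    return byte_at(probe, 0) == 1;
}
```

This also makes the endianness difference visible: the same probe value has a different first byte on little-endian and big-endian hosts.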

However, to reiterate: the standard does not guarantee that "any compiler conforming to the c++ standard always produce the same output" for the example you posted.

Upvotes: 2
