Reputation: 5472
Reading https://en.cppreference.com/w/cpp/language/reinterpret_cast, I wonder which use-cases of reinterpret_cast are not UB and are actually used in practice.
That page lists many cases where it is legal to convert a pointer to some other type and then back again, but that seems to be of little practical use. Accessing an object through a reinterpret_cast pointer is mostly UB due to violations of strict aliasing (and/or alignment), except when accessing it through a char*/byte* pointer.
One helpful exception is casting an integer-constant to a pointer and accessing the target object, which is useful for manipulation of HW-registers (in µC).
Can anyone name some relevant, real use-cases of reinterpret_cast that are used in practice?
Upvotes: 26
Views: 2951
Reputation: 76829
Some examples that come to mind:
Reading/writing the object representation of a trivially-copyable object, for example to write the byte representation of the object to a file and read it back:
// T must be trivially-copyable object type!
T obj;
//...
std::ofstream file(/*...*/);
file.write(reinterpret_cast<char*>(&obj), sizeof(obj));
//...
std::ifstream file(/*...*/);
file.read(reinterpret_cast<char*>(&obj), sizeof(obj));
Technically, it is currently not really specified how accessing the object representation should work, aside from directly passing the pointer on to memcpy et al., but there is a current proposal for the standard to clarify at least how reading (but not writing) individual bytes in the object representation should work, see https://github.com/cplusplus/papers/issues/592.
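As a minimal, self-contained sketch of the read/write pattern above, the following round-trips an object through an in-memory stream instead of a file. The type Sample and the helpers write_raw, read_raw and round_trip are hypothetical names introduced only for illustration, and the resulting byte format is not portable across platforms (padding, endianness, type sizes).

```cpp
#include <cstdint>
#include <ios>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical record type, used only for illustration.
struct Sample {
    std::int32_t id;
    double value;
};
static_assert(std::is_trivially_copyable_v<Sample>,
              "raw byte I/O is only valid for trivially copyable types");

// Writes the object representation of `s` to `out`.
void write_raw(std::ostream& out, const Sample& s) {
    out.write(reinterpret_cast<const char*>(&s), sizeof(s));
}

// Reads sizeof(Sample) bytes into an existing object.
// Returns false if fewer bytes were available.
bool read_raw(std::istream& in, Sample& s) {
    in.read(reinterpret_cast<char*>(&s), sizeof(s));
    return in.gcount() == static_cast<std::streamsize>(sizeof(s));
}

// Round-trips a Sample through a stringstream standing in for a file.
Sample round_trip(const Sample& s) {
    std::stringstream stream(std::ios::in | std::ios::out | std::ios::binary);
    write_raw(stream, s);
    Sample result{};
    read_raw(stream, result);
    return result;
}
```

Calling round_trip(Sample{7, 2.5}) should give back an object whose id and value compare equal to the input, as long as writer and reader are built for the same platform and ABI.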
Reinterpreting between signed and unsigned variants of the same integral type, especially char and unsigned char for strings, which may be useful if an API expects an unsigned string.
auto str = "hello world!";
auto unsigned_str = reinterpret_cast<const unsigned char*>(str);
While this is allowed by the aliasing rules, technically pointer arithmetic on the resulting unsigned_str pointer is currently not defined by the standard. But I don't really see why it isn't.
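A small sketch of why the unsigned view can matter: where plain char is signed, a byte such as 0xE9 reads as a negative value through char, while the same byte read through unsigned char gives 233. The helper name first_byte_value is hypothetical, and the sketch deliberately reads only the first byte so that it does not rely on the pointer arithmetic discussed above.

```cpp
// Returns the value of the first byte of `s` as seen through unsigned char
// (0..255 on platforms with 8-bit bytes).
unsigned first_byte_value(const char* s) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(s);
    return *bytes;
}
```

A value obtained this way is always non-negative, which is what functions such as std::isalpha require of their argument.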
Accessing objects nested within a byte buffer (especially on the stack):
alignas(T) std::byte buf[42*sizeof(T)];
new(buf+sizeof(T)) T;
// later
auto ptr = std::launder(reinterpret_cast<T*>(buf + sizeof(T)));
This works as long as the address buf + sizeof(T) is suitably aligned for T, the buffer has type std::byte or unsigned char, and obviously is of sufficient size. The new expression also returns a pointer to the object, but one might not want to store that for each object. If all objects stored in the buffer are the same type, it would also be fine to use pointer arithmetic on a single such pointer.
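The buffer technique above could be wrapped roughly as follows. This is only a sketch under stated assumptions: Point and PointSlots are hypothetical names, Point is trivially destructible (so no destructor calls are needed), and the caller is assumed to read only slots it has constructed earlier.

```cpp
#include <cstddef>
#include <new>

// Hypothetical element type, used only for illustration.
struct Point {
    int x;
    int y;
};

// Fixed-capacity storage that keeps Point objects inside a raw byte buffer.
class PointSlots {
public:
    static constexpr std::size_t capacity = 4;

    // Constructs a Point in slot `i`. Assumes i < capacity.
    void emplace(std::size_t i, int x, int y) {
        ::new (static_cast<void*>(buf_ + i * sizeof(Point))) Point{x, y};
    }

    // Returns the Point previously constructed in slot `i`.
    // Assumes emplace(i, ...) was called before.
    Point& at(std::size_t i) {
        return *std::launder(reinterpret_cast<Point*>(buf_ + i * sizeof(Point)));
    }

private:
    // alignas(Point) makes every offset i * sizeof(Point) suitably aligned.
    alignas(Point) std::byte buf_[capacity * sizeof(Point)];
};
```

After slots.emplace(2, 3, 4) on a PointSlots object, slots.at(2) refers to the Point with x == 3 and y == 4 without any pointer to it having been stored.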
Obtaining a pointer to a specific memory address. Whether and for which address values this is possible is implementation-defined, as is any possible use of the resulting pointer:
auto ptr = reinterpret_cast<void*>(0x12345678);
Casting a void* returned by dlsym (or a similar function) to the actual type of a function located at that address. Whether this is possible and what exactly the semantics are is again implementation-defined:
// my_func is a C linkage function with type `void()` in `my_lib.so`
// error checking omitted!
auto lib = dlopen("my_lib.so", RTLD_LAZY);
auto my_func = reinterpret_cast<void(*)()>(dlsym(lib, "my_func"));
my_func();
Various round-trip casts may be useful to store pointer values or for type erasure.
Round-trip of an object pointer through void* requires only static_cast on both sides, and reinterpret_cast on object pointers is defined in terms of a two-step static_cast through (cv-qualified) void* anyway.
Round-trip of an object pointer through std::uintptr_t, std::intptr_t, or another integral type large enough to hold all pointer values may be useful for having a representation of the pointer value that can be serialized (although I am not sure how often that really is useful). It is however implementation-defined whether any of these types exist. Typically they will, but the standard permits exotic platforms where memory addresses cannot be represented as single integer values or where all integer types are too small to cover the address space. I would also be wary of the compiler's pointer analysis causing issues depending on how you use this; see e.g. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752, which is just the first bug report I found. The standard isn't particularly clear on how the integer -> pointer cast is supposed to work, especially when considering pointer provenance. See for example https://open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2318r1.pdf and other documents linked therein.
Round-trip of a function pointer through any arbitrary function pointer type (likely void(*)()) may be useful to erase the type from arbitrary functions, although again I am not sure how often that really is useful. void* type-erased arguments are common in C APIs when a function just passes through data, but type-erased function pointers like that are less common.
A round-trip cast of a function pointer through void* may be used in a similar way as above, as dlsym essentially does with the additional dynamic library complication. This is conditionally-supported only, although it is effectively required for POSIX systems. (It is not generally supported, because object and function pointer values may have distinct representations, size, alignment etc. on some more exotic platforms.)
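Two of the round trips described above can be sketched as follows. All names here (to_integer, from_integer, ErasedFn, erase, call_erased, add_one) are hypothetical and exist only for illustration. The sketch assumes that std::uintptr_t is available on the target platform, and that an erased function pointer is always cast back to its exact original type before being called.

```cpp
#include <cstdint>

// Object pointer -> integer -> object pointer.
std::uintptr_t to_integer(int* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

int* from_integer(std::uintptr_t value) {
    return reinterpret_cast<int*>(value);
}

// Function pointer -> generic function pointer -> original function pointer.
using ErasedFn = void (*)();  // used for storage only, never called directly

int add_one(int v) { return v + 1; }

ErasedFn erase(int (*fn)(int)) {
    return reinterpret_cast<ErasedFn>(fn);
}

int call_erased(ErasedFn fn, int arg) {
    // Cast back to the original type first; calling through ErasedFn
    // itself would be undefined behavior.
    return reinterpret_cast<int (*)(int)>(fn)(arg);
}
```

With an int x, from_integer(to_integer(&x)) compares equal to &x, and call_erased(erase(&add_one), 41) returns 42.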
Upvotes: 35
Reputation: 81247
The most common situations where reinterpret_cast is used without undefined behavior involve dialects which extend the C++ language by specifying how they will behave in more situations than mandated by the Standard (i.e. defining the behavior). Although the C++ Standard would allow implementations to treat programs that "violate" the aliasing rules as erroneous, the Standard doesn't require that such programs be viewed that way. According to the C++ Standard itself:
Although this document states only requirements on C++ implementations, those requirements are often easier to understand if they are phrased as requirements on programs, parts of programs, or execution of programs.... If a program contains a violation of a rule for which no diagnostic is required, this document places no requirement on implementations with respect to that program.
Nearly all practical implementations can be configured to extend the semantics of the language by processing reinterpret_cast in a manner consistent with the representations of the objects involved without regard for whether the Standard would require that they do so. The fact that reinterpret_cast allows a consistent syntax for non-portable constructs that exploit such extensions is more broadly useful than most of the "portable" ways the construct may be used.
Upvotes: 4
Reputation: 614
Another real-world example of using reinterpret_cast is calling the various network-related functions that accept a struct sockaddr * parameter, such as recvfrom(), bind() or accept().
For example, the following is the declaration of the recvfrom function:
ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags,
struct sockaddr *src_addr, socklen_t *addrlen);
Its fifth argument is defined as struct sockaddr *src_addr and it acts as a general interface for accepting a pointer to a structure of a specific address type (for example, sockaddr_in or sockaddr_in6).
Beej's Guide to Network Programming says:
In memory, the struct sockaddr_in and struct sockaddr_in6 share the same beginning structure as struct sockaddr, and you can freely cast the pointer of one type to the other without any harm, except the possible end of the universe.
Just kidding on that end-of-the-universe thing…if the universe does end when you cast a struct sockaddr_in* to a struct sockaddr*, I promise you it’s pure coincidence and you shouldn’t even worry about it.
So, with that in mind, remember that whenever a function says it takes a struct sockaddr* you can cast your struct sockaddr_in*, struct sockaddr_in6*, or struct sockaddr_storage* to that type with ease and safety.
For example:
int fd; // file descriptor value obtained elsewhere
struct sockaddr_in addr {};
socklen_t addr_len = sizeof(addr);
std::vector<std::uint8_t> buffer(4096);
const ssize_t bytes_recv = recvfrom(fd, buffer.data(), buffer.size(), 0,
reinterpret_cast<sockaddr*>(&addr), &addr_len);
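The same cast can be tried without an open socket. The sketch below passes a sockaddr_in to getnameinfo(), which takes a const struct sockaddr*, to turn an IPv4 address back into text. The helper name numeric_host is hypothetical, and the code assumes a POSIX system (it was written with Linux/glibc in mind; platforms whose sockaddr_in has a sin_len member may additionally expect that member to be set).

```cpp
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <string>

// Parses an IPv4 address from text, then formats it again via getnameinfo().
// Returns an empty string if parsing or formatting fails.
std::string numeric_host(const char* ipv4_text) {
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    if (inet_pton(AF_INET, ipv4_text, &addr.sin_addr) != 1) {
        return {};
    }

    char host[INET_ADDRSTRLEN];
    // NI_NUMERICHOST requests the numeric form, so no DNS lookup is made.
    const int rc = getnameinfo(reinterpret_cast<const sockaddr*>(&addr),
                               sizeof(addr), host, sizeof(host),
                               nullptr, 0, NI_NUMERICHOST);
    return rc == 0 ? std::string(host) : std::string();
}
```

On such a system, numeric_host("127.0.0.1") should return "127.0.0.1", and input that is not a valid IPv4 address yields an empty string.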
Upvotes: 10