Reputation: 53
First, a word in advance: the following code should not be used as it is; it is just working code condensed to the critical point. The question is only where the following code violates the standard (C++17, but C++20 is also fine) and, if it does not, whether the standard guarantees the "correct output". It is not an example for beginners of how to write code or anything like that; it is purely a question about the standard. (On request via pm: an alternative version is further below.)
Assume for the following that the class Base is never directly instantiated, but only via Derived<Size> for some std::size_t Size. Otherwise the undefined behaviour is obvious.
#include <cstddef>
#include <new> // std::launder and placement new
struct Header
{ const std::size_t m_size; /* more stuff, remains standard layout */ };
struct alignas(Header) Base
{
std::size_t getCapacity()
{ return getHeader().m_size; }
std::byte *getBufferBegin() {
// Allowed by [basic.lval] (11.8)
return reinterpret_cast<std::byte *>(this);
// Does this give the same as the following code (which has to be commented out as Size is unknown):
// // Assume this "is actually an instance of Derived<Size>" for some Size, then
// // [expr.static.cast]-11 allows
// Derived<Size> * me_p = static_cast<Derived<Size> *>(this);
// // [basic.compound].4 + 4.3: say that
// // instances of standard-layout types and its first member are pointer-interconvertible:
// Derived<Size>::memory_type * data_p = reinterpret_cast<memory_type *>(me_p);
// Derived<Size>::memory_type & data = *data_p;
// // Array-to-pointer decay is allowed
// std::byte * begin_p = static_cast<std::byte *>(data);
// return begin_p;
}
std::byte * getDataMemory(int idx)
{
// For 0 <= idx < "Size", this is guaranteed to be valid pointer arithmetic
return getBufferBegin() + sizeof(Header) + idx * sizeof(int);
}
Header & getHeader()
{
// This is one of the two purposes of launder (see Derived::Derived for the in-place new)
return *std::launder(reinterpret_cast<Header *>(getBufferBegin()));
}
int & getData(int idx)
{
// This is one of the two purposes of launder (see Derived::Derived for the in-place new)
return *std::launder(reinterpret_cast<int*>(getDataMemory(idx)));
}
};
template<std::size_t Size>
struct Derived : Base
{
Derived() {
new (Base::getBufferBegin()) Header { Size };
for(int idx = 0; idx < Size; ++idx)
new (Base::getDataMemory(idx)) int;
}
~Derived() {
// As Header (and int) are trivial types, there is no need to call the destructors here,
// as their lifetime ends with the lifetime of their memory, but we could call them here
}
using memory_type = std::byte[sizeof(Header) + Size * sizeof(int)];
memory_type data;
};
The question is not whether the code is nice, not whether you should do this, and not whether it will work in every single or any specific compiler, and please also forget alignment/padding for absurd compilers ;). Thus, please do not comment on style, on whether one should do this, on missing const etc., or on what to take care of when generalizing this (padding, alignment etc.), but only on where the code violates the standard and, if it does not, whether the standard guarantees the expected output (e.g. that getBufferBegin returns the begin of the buffer).
Please be so kind as to refer to the standard for any answer!
Thanks a lot
Chris
Edited: Both versions are equivalent; answer whichever you like more... As there seems to be quite a lot of misunderstanding, and nobody is reading the explaining comments :-/, let me "rephrase" the code in an alternative version containing the same questions. In three steps:
getDataN<100>(static_cast<void*>(&d)); and getData4(static_cast<Base*>(&d)); for an instance Derived<100> d
struct Data { /* ... remains standard layout, not empty */ };
struct alignas(Data) Base {};
template<std::size_t Size>
struct Derived : Base { Data d; }; // must derive from Base for getData1b/getData4
// Definitely valid
template<std::size_t Size>
Data * getData1a(void * ptr)
{ return &static_cast<Derived<Size>*>(ptr)->d; }
template<std::size_t Size>
Data * getData1b(Base * ptr)
{ return &static_cast<Derived<Size>*>(ptr)->d; }
// Also valid: First element in standard layout
template<std::size_t Size>
Data * getData2(void * ptr)
{ return reinterpret_cast<Data *>(static_cast<Derived<Size>*>(ptr)); }
// Valid?
Data * getData3(void * ptr)
{ return reinterpret_cast<Data *>(ptr); }
// Valid?
Data * getData4(Base* ptr)
{ return reinterpret_cast<Data *>(ptr); }
getMemN<100>(static_cast<void*>(&d)); / getMem5(static_cast<Data<100>*>(&d)); for a Data<100> d
template<std::size_t Size>
using Memory = std::byte[Size];
template<std::size_t Size>
struct Data { Memory<Size> data; };
template<std::size_t Size>
std::byte *getMem1(void * ptr)
{ return &(static_cast<Data<Size>*>(ptr)->data[0]); }
// Also valid: First element in standard layout
template<std::size_t Size>
std::byte *getMem2(void * ptr)
{ return std::begin(*reinterpret_cast<Memory<Size> *>(static_cast<Data<Size>*>(ptr))); }
template<std::size_t Size>
std::byte *getMem3(void * ptr)
{ return static_cast<std::byte*>(*reinterpret_cast<Memory<Size> *>(static_cast<Data<Size>*>(ptr))); }
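// Valid? (direct cast to the first byte, as in Base::getBufferBegin)
template<std::size_t Size>
std::byte *getMem4(void * ptr)
{ return reinterpret_cast<std::byte*>(ptr); }
// Valid?
template<std::size_t Size>
std::byte *getMem5(Data<Size> * ptr)
{ return reinterpret_cast<std::byte*>(ptr); }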
std::byte data[100];
new (std::begin(data)) std::int32_t{1};
new (std::begin(data) + 4) std::int32_t{2};
// ...
// The laundered pointer has to be dereferenced before assigning to the object
*std::launder(reinterpret_cast<std::int32_t*>(std::begin(data))) = 3;
*std::launder(reinterpret_cast<std::int32_t*>(std::begin(data) + 4)) = 4;
*std::launder(reinterpret_cast<std::int32_t*>(std::begin(data))) = 5;
*std::launder(reinterpret_cast<std::int32_t*>(std::begin(data) + 4)) = 6;
Upvotes: 0
Views: 392
Reputation: 53
The argumentation below, that a base class of a standard-layout class is pointer-interconvertible with a derived class, is incorrect. More precisely, it holds only if the derived class does not have any member (including members of a base class). Therefore, the strange discussion using C does not work, as C inherits the members of Derived, and they count as members of C.
As Base and Derived are not pointer-interconvertible, using std::launder to access the data of Derived (see below) is against the standard, as the object representation of Derived is not reachable from the pointer to the Base instance. So even if a pointer to Base has the same value as a pointer to Derived, the access via Base::getHeader would not necessarily be defined behaviour; it is probably undefined behaviour, as there is no reason to think otherwise.
Note: The compiler cannot assume that this data is not accessed via a Base pointer, as the data is accessible after a static_cast to Derived, and therefore no optimization relying on that may be applied to this data. However, it remains undefined behaviour if you use a reinterpret_cast (even if the value of the pointer is the same).
Question: Is there anything in the standard enforcing that a pointer to Derived is also a pointer to Base? They explicitly might have the same address, but are they guaranteed to (at least for standard layout)?
Or, put differently, is reinterpret_cast<Base*>(&d) for a Derived d a well-defined pointer to the base subobject? (Regardless of accessibility)
PS: With C++20, we have std::is_pointer_interconvertible_base_of, with which we can check whether it holds for the given types.
Yes, the presented code is both well-defined and behaves as expected. Let us look at the critical methods Base::getBufferBegin, Base::getData, and Base::getHeader one by one.
Base::getBufferBegin
First, let us show a sequence of well-defined casts that leads from the this pointer to a pointer to the first element of the array data in the Derived instance. Second, let us show that the given reinterpret_cast is well-defined and gives the same result. For simplification, forget about member functions as a first step.
using memory_type = std::byte[100];
Derived<100> & derived = /* what ever */;
Base * b_p {&derived}; // Definition of this, when calling a member function of base.
// 1) Cast to pointer to child: [expr.static.cast]-11
// "A prvalue of type “pointer to B”, where B is a class type, can be converted to a prvalue
// of type “pointer to D”, where D is a class derived(Clause 13) from B."
// allowing for B=Base, D=Derived<100>
auto * d_p = static_cast<Derived<100> *>(b_p);
// 3. Cast to first member (memory_type ) is valid and does not change the value
// [basic.compound].4 + 4.3: -> standard-layout and first member are
// pointer-interconvertible, so the following is valid:
memory_type * data_p = reinterpret_cast<memory_type *>(d_p);
// 4. Cast to pointer to first element is valid and does not change the value
// [dcl.array].1 "An object of array type contains a contiguously allocated non-empty set of
// N subobjects of type T."
// [intro.object].8 "Unless an object is a bit - field or a base class subobject of zero
// size, the address of that object is the address of the first byte it occupies."
// [expr.sizeof]. "When applied to an array, the result is the total number of bytes in the
// array. This implies that the size of an array of n elements is n times the size of an
// element." Thus, casting to the binary representation (by [basic.lval].11 always allowed!)
std::byte * begin_p = reinterpret_cast<std::byte *>(data_p); // Note: pointer to array!
// results in the same as std::byte * begin_p = std::begin(*data_p)
A reinterpret_cast does not change the value of the given pointer², so if the first cast can be replaced by a reinterpret_cast without changing the resulting value, then the above gives the same value as std::byte * begin_p = reinterpret_cast<std::byte *>(b_p);
[basic.compound].4 + 4.3 says (rephrased): a pointer to an instance of a standard-layout class without members is pointer-interconvertible with (and has the same address as) a pointer to any of its base class subobjects. Thus (missing: C), if C were a standard-layout child class of Derived<100>, then a pointer to an instance of C would be pointer-interconvertible with a pointer to the sub-object Derived<100> and with one to the sub-object Base. By transitivity of pointer-interconvertibility ([basic.compound].4.4), a pointer to Base would be pointer-interconvertible with a pointer to Derived<100> if such a class existed. Either we define C<Size> to be such a class and use C<100> instead of Derived<Size>, or we just accept that it is not predictable from any object file whether there could be such a class C, so the only way to ensure this is that these two are pointer-interconvertible regardless of such a class C (and its existence). In particular, auto * d_p = reinterpret_cast<Derived<100> *>(b_p); can be used instead of the static_cast.
Last step for Base::getBufferBegin: can we replace all of the above by reinterpret_cast<std::byte*>(b_p);? First of all, yes, we are allowed to do this cast, as casting to the binary representation (by [basic.lval].11) is always allowed and does not change the value of the pointer²! Secondly, this cast gives the same result, as we have just shown that the casts above can all be replaced by reinterpret_casts (not changing the value²).
All in all, this shows that Base::getBufferBegin() is well-defined and behaves as expected (the returned pointer points to the first element in the buffer data of the child class).
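As a small sanity check of the address part of this argument, the following sketch compares the addresses involved in the cast chain. The types are trimmed copies of those in the question, and sameAddress is a hypothetical helper that is not part of the original code. It only observes what one implementation does with the pointer values; it says nothing about whether access through a reinterpret_cast pointer is defined behaviour.

```cpp
#include <cstddef>
#include <type_traits>

struct Header { std::size_t m_size; };
struct alignas(Header) Base {};

template <std::size_t Size>
struct Derived : Base {
    std::byte data[sizeof(Header) + Size * sizeof(int)];
};

// The whole discussion relies on Derived<Size> being standard layout.
static_assert(std::is_standard_layout_v<Derived<100>>);

// Hypothetical helper: compares the addresses of the object, its Base
// subobject, its first member, and the first element of that member.
bool sameAddress(Derived<100>& d) {
    Base* b_p = &d;                // implicit derived-to-base conversion
    void* viaBase = b_p;
    void* viaDerived = &d;
    void* viaMember = &d.data;     // pointer to the whole array
    void* viaElement = &d.data[0]; // pointer to its first element
    return viaBase == viaDerived && viaDerived == viaMember &&
           viaMember == viaElement;
}
```

If the static_assert fails for a modified Derived, the standard-layout based reasoning does not apply to that type at all.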
Base::getHeader
The constructor of Derived<Size> constructs a Header instance at the first byte of the array data. By the above, Base::getBufferBegin gives a pointer to exactly this byte. The question remains whether we are allowed to access the header through this pointer. For simplicity, let me cite cppreference here (having ensured that the same is in the standard, but less understandable). Citing the "notes" from there:
Typical uses of std::launder include: [...] Obtaining a pointer to an object created by placement new from a pointer to an object providing storage for that object.
Which is exactly what we are doing here, so everything is fine, isn't it? No, not yet. Looking at the requirements of std::launder, we need that "every byte that would be reachable through the result is reachable through p [the given pointer]". But is this the case here? The answer is yes, surprisingly it is. The above argumentation (just search for [basic.compound].4.4 ;)) gives that a pointer to Base is pointer-interconvertible with a pointer to Derived<Size>. By the definition of reachability of a byte via a pointer, this means that the full binary representation of Derived<Size> is reachable through a pointer to Base (note that this is only true for standard-layout classes!). Thus, reinterpret_cast<Header*>(this); gives a pointer to the Header instance through which every byte of the binary representation of Header is reachable, satisfying the conditions of std::launder. Thus, std::launder (being a no-op) results in a valid object pointer to the header.
Missing: Binary representation of Derived reachable through a pointer to Base (no static_cast usage!)
Do we need that std::launder? Yes, formally we do, as this reinterpret_cast contains two casts between objects that are not pointer-interconvertible: (1) the pointer to the array and the pointer to its first element (which seems to be the most trivial one in the full discussion), and (2) the pointer to the binary representation of Header and the pointer to the object Header. The fact that Header is standard layout does not change anything!
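The "typical use" of std::launder quoted above can be sketched in isolation, without the Base/Derived question. In this minimal example the storage is a plain std::byte array member; Storage and storedSize are hypothetical names introduced only for illustration, and the non-const m_size is a simplification of the Header from the question.

```cpp
#include <cstddef>
#include <new>

struct Header { std::size_t m_size; };

// Hypothetical storage type: a std::byte array can provide storage for
// another object created in it by placement new.
struct Storage {
    alignas(Header) std::byte data[sizeof(Header)];
};

std::size_t storedSize(Storage& s, std::size_t n) {
    // Create the Header inside the byte array.
    ::new (static_cast<void*>(s.data)) Header{n};
    // s.data decays to a pointer to the first byte. That byte and the
    // Header are not pointer-interconvertible, so the reinterpret_cast
    // alone still points to the byte; std::launder yields a pointer to
    // the Header that lives at that address.
    Header* h = std::launder(reinterpret_cast<Header*>(s.data));
    return h->m_size;
}
```

Here every byte of the Header is reachable through the pointer into the array, which is the precondition the discussion above tries to establish for the pointer to Base.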
Base::getData
See Base::getHeader, with the sole addition that we are allowed to do the given pointer arithmetic (for 0<=idx and idx<=Size), as the given pointer points to the first element of the array data and the full array data is reachable through the pointer (see above).
Done.
Certification of a compiler ensures that we can rely on it doing what the standard says (and nothing more). By the above, the standard says that we are allowed to do this stuff.
Get a reference to a non-trivial container (eg list, map) to a static memory buffer without
Header
and the type stored in the buffer are standard layout, too). The latter two are needed as the structure is sent around via IPC.
²: Yes, a reinterpret_cast of a pointer type to another pointer type does not change the value. Everybody assumes that, but it is also in the standard ([expr.static.cast].13):
If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified. (...) Otherwise, the pointer value is unchanged by the conversion.
That shows that static_cast<T*>(static_cast<void*>(u)) does not change the pointer, and by [expr.reinterpret.cast].7 this is equivalent to the corresponding reinterpret_cast.
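The equivalence used in this footnote can be written out as a short check. castsAgree is a hypothetical helper for illustration; it compares the result of the reinterpret_cast with the two-step static_cast through void*.

```cpp
#include <cstddef>

struct Header { std::size_t m_size; };

// Hypothetical helper: both spellings of the cast are specified to give
// the same result ([expr.reinterpret.cast].7), and the address is
// unchanged because std::byte has no alignment requirement beyond 1.
bool castsAgree(Header* h) {
    std::byte* viaReinterpret = reinterpret_cast<std::byte*>(h);
    std::byte* viaStatic = static_cast<std::byte*>(static_cast<void*>(h));
    return viaReinterpret == viaStatic &&
           static_cast<void*>(viaStatic) == static_cast<void*>(h);
}
```

Note that this concerns only the value (address) of the pointer, not which object the resulting pointer points to.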
Upvotes: 1