Reputation: 583
So I was reading P2590R2, which introduces std::start_lifetime_as, and one of the sections is making me question something that I had previously thought was defined behaviour, so I was hoping someone might be able to clarify.
If I have some code like this:
#include <algorithm>    // std::copy
#include <cstddef>      // size_t
#include <new>          // placement new
#include <string_view>  // std::wstring_view

struct WString_header
{
    WString_header(std::wstring_view str)
        : m_size(str.size())
    {
        // this shouldn't be undefined behaviour so long as
        // the storage we're being allocated in (i.e. an array of bytes)
        // can actually fit this array?
        wchar_t *chars = new (this + 1) wchar_t[str.size()];
        std::copy(str.begin(), str.end(), chars);
    }

    size_t size() const
    {
        return m_size;
    }

    const wchar_t *chars() const
    {
        // undefined behaviour?
        return reinterpret_cast<const wchar_t *>(this + 1);
    }

protected:
    size_t m_size;
};
I previously thought that retrieving the chars via reinterpret_cast wouldn't be undefined behaviour, as we had previously created an object of that type at that address in the constructor. However, in the linked paper, in the section explaining the difference between start_lifetime_as and launder, they say:
On the other hand, std::launder never creates a new object, but
can only be used to obtain a pointer to an object that already exists at the given memory location,
with its lifetime already started through other means.
The fact that std::launder is being referred to with this wording makes me think I might have an incorrect model of object lifetime.
As I previously understood it, once you create an object at a given location in memory, accessing it through reinterpret_cast would always be defined behaviour, no matter how or where the pointer being cast was obtained, as that memory address does contain an instance of that type (obviously minding issues like arrays, where simply calculating an address outside the array, other than the past-the-end pointer, is undefined behaviour).
However, the description of std::launder in the paper makes me think that model should be amended: any pointer obtained via reinterpret_cast should first be passed through launder before it is dereferenced, to prevent undefined behaviour (unless the type is char or similar, due to the unique aliasing rules for those types). This then raises the question of why reinterpret_cast would ever be useful for converting pointers, if the only thing that could be done with the result would be pointer arithmetic and comparison with pointers of the new type.
So have I just been lucky with compiler optimisations of my code thus far, or am I getting confused by the wording in the paper?
Upvotes: 1
Views: 244
Reputation: 76829
should instead first be passed through launder to prevent undefined behaviour
Yes indeed. In general that is required when actually dereferencing the resulting pointer (some exceptions apply for pointer-interconvertible objects).
the only thing that could be done with the result would be pointer arithmetic
Not even that is allowed. If, because you used reinterpret_cast, the pointed-to type of the expression is not the actual type of the pointed-to object (or a type similar to it), then pointer arithmetic is also UB. (See [expr.add]/6.)
reinterpret_cast on pointers is directly useful to access objects only if you have a pointer to an object and cast it to a pointer type X*, such that there is an object of type X in its lifetime which is pointer-interconvertible with the original object. That is the case e.g. for a union object and its active member subobject.
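To illustrate the pointer-interconvertible case, here is a minimal sketch. The types Tagged and IntOrFloat and both helper functions are hypothetical names made up for this example. It relies on two cases I am confident the standard lists as pointer-interconvertible: a standard-layout class object with its first non-static data member, and a union object with its members.

```cpp
#include <cassert>

// Hypothetical standard-layout class: `tag` is its first non-static data member.
struct Tagged {
    int tag;
    double value;
};

// Hypothetical union used to show the union/member case.
union IntOrFloat {
    int i;
    float f;
};

// A standard-layout class object and its first member are
// pointer-interconvertible, so the cast alone yields a pointer to `tag`.
int read_tag(Tagged* t) {
    assert(t != nullptr);
    return *reinterpret_cast<int*>(t);
}

// A union object and its member are pointer-interconvertible.
// Reading is only valid while `i` is the active member.
int read_active_int(IntOrFloat* u) {
    assert(u != nullptr);
    return *reinterpret_cast<int*>(u);
}
```

No std::launder is needed in either function, because the cast itself already produces a pointer to the member object.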
For other cases reinterpret_cast only changes the expression type without affecting the pointer value and you need to apply std::launder after the cast in addition to making sure there is an object of type X at the same memory address in its lifetime (otherwise std::launder also has UB).
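A minimal sketch of the cast-then-launder pattern, assuming C++17 or later. The function names read_int_at and store_then_read are hypothetical. The int is created inside a byte array, so the array and the int are not pointer-interconvertible and the cast alone is not enough.

```cpp
#include <cassert>
#include <cstdint>
#include <new>  // placement new, std::launder

// Precondition (assumed by this sketch): an int object is alive at `storage`.
int read_int_at(unsigned char* storage) {
    // Alignment must hold, otherwise the cast result is unspecified.
    assert(reinterpret_cast<std::uintptr_t>(storage) % alignof(int) == 0);
    // The cast only changes the type; std::launder then yields a pointer
    // to the int object that lives at this address.
    return *std::launder(reinterpret_cast<int*>(storage));
}

int store_then_read(int value) {
    alignas(int) unsigned char storage[sizeof(int)];
    new (storage) int(value);  // starts the lifetime of an int in `storage`
    return read_int_at(storage);
}
```

The reachability precondition is satisfied here because the bytes of the int lie inside the byte array that the original pointer points into.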
reinterpret_cast without std::launder is then only useful to e.g. pass a pointer value with a different type through a function that doesn't touch it to later cast it back to the original type before dereferencing it.
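A small sketch of that pass-through use. All names here (Widget, Callback, invoke, widget_id, call_with_widget) are hypothetical. The intermediate type is unsigned char*, whose alignment requirement is no stricter than Widget's, so casting back yields the original pointer value.

```cpp
#include <cassert>

struct Widget { int id; };  // hypothetical payload type

using Callback = int (*)(unsigned char* user_data);

// Forwards the pointer without ever dereferencing it.
int invoke(Callback cb, unsigned char* user_data) {
    return cb(user_data);
}

int widget_id(unsigned char* user_data) {
    assert(user_data != nullptr);
    // Cast back to the original type before dereferencing.
    Widget* w = reinterpret_cast<Widget*>(user_data);
    return w->id;
}

int call_with_widget(Widget& w) {
    return invoke(&widget_id, reinterpret_cast<unsigned char*>(&w));
}
```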
(Also reinterpret_cast can do more kinds of conversions than pointer-to-pointer.)
However, std::launder may not save you in your case.
std::launder has a precondition that using it won't make any more bytes reachable than would have been reachable through the original pointer. Here "reachable" basically means bytes that one could access by pointer arithmetic and reinterpret_cast between pointer-interconvertible objects. (See [ptr.launder]/4.)
Whether or not bytes after *this are reachable according to this precondition depends on how exactly memory was obtained and what the structure of the objects placed into it is. It is not enough that memory after *this is allocated. If the bytes are not "reachable" according to the definition above, then there is no way to access them at all from this. The rules are written specifically so that this is impossible, presumably to allow optimizations with respect to the object structure, although I am not aware that any compiler makes use of that. (See e.g. this question.)
(For the same reason the placement-new itself should then also have undefined behavior, but it seems that the specification is inconsistent in that regard at the moment, see my question here.)
If however *this is e.g. part of an array of WString_header objects (created explicitly or implicitly) spanning the memory of the placement new'ed wchar_t array, then there is no problem with the reachability condition and adding std::launder will be enough to give the code defined behavior. Whether that is the case depends on how exactly memory is obtained and objects placed in it.
Also, you need to make sure that this + 1 is correctly aligned for wchar_t, otherwise reinterpret_cast will result in an unspecified value and basically doing anything but copying it will cause UB. It won't even be possible to cast back to the original pointer type/value. Alignment must always be assured. In your case you are not checking that explicitly, but you are probably fine, because the size and alignment of the size_t member make sizeof(WString_header) a multiple of alignof(wchar_t) on typical platforms.
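The alignment assumption can be made explicit. Below is a minimal sketch; HeaderSketch stands in for WString_header, and the helper names are hypothetical. The runtime check assumes the usual implementation-defined mapping where converting a pointer to std::uintptr_t gives its numeric address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct HeaderSketch { std::size_t m_size; };  // stand-in for WString_header

// Compile-time check: if the header itself is suitably aligned, the address
// one header past it is aligned for wchar_t as well.
static_assert(alignof(HeaderSketch) >= alignof(wchar_t),
              "header alignment is weaker than wchar_t alignment");
static_assert(sizeof(HeaderSketch) % alignof(wchar_t) == 0,
              "trailing wchar_t array would be misaligned");

// Runtime check on a concrete address.
bool is_aligned_for_wchar(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(wchar_t) == 0;
}

// Could be called at the start of the constructor in debug builds.
void check_trailing_alignment(const HeaderSketch* h) {
    assert(is_aligned_for_wchar(h + 1));
}
```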
So have I just been lucky with compiler optimisations of my code thus far, or am I getting confused by the wording in the paper?
Of course my answer above is based only on what the standard defines. If a compiler doesn't do aggressive optimizations based on what the standard leaves undefined, then there is no reason that your code won't work (assuming alignment is ok and you allocated enough memory after this). A compiler is free to be more permissive than the standard requires.
I am not aware of any compiler making optimizations based on that UB but I cannot make you any guarantees that there aren't cases I am unaware of or that compilers won't start making such optimizations in the future.
Upvotes: 2