Reputation: 1876
I am trying to wrap my head around the implicit lifetime and aliasing rules in C++.
The standard says:
Some operations are described as implicitly creating objects within a specified region of storage. For each operation that is specified as implicitly creating objects, that operation implicitly creates and starts the lifetime of zero or more objects of implicit-lifetime types in its specified region of storage if doing so would result in the program having defined behavior.
And:
An operation that begins the lifetime of an array of unsigned char or std::byte implicitly creates objects within the region of storage occupied by the array.
As well as providing an example that states:
// The call to std::malloc implicitly creates an object of type X
Does that mean that the following is legal and correct?
constexpr size_t N = 10;
constexpr size_t S = sizeof(uint32_t);
std::vector<std::byte> buffer;
buffer.resize(N * S);
for (size_t i = 0; i < N * S; i += S)
    *reinterpret_cast<uint32_t*>(&buffer.data()[i + 2 * S]) = 42;
uint32_t x;
for (size_t i = 0; i < S; ++i)
    *(reinterpret_cast<std::byte*>(&x) + i) = buffer[i];
assert(x == 42);
If not, what am I missing? And is there a way to make it legal using the C++23 subset that LLVM 17 (Clang and libc++) supports?
Note: Even though I tagged the question as "language lawyer", I myself am not one, so I would very much appreciate an explanation in as "simple" terms as possible.
Upvotes: 3
Views: 170
Reputation: 81247
There has never been a consensus understanding as to how some of the rules in the C and C++ Standards are supposed to work--at least not one that is compatible with the way clang and gcc actually process programs.
In the following code, if p points to a region of storage which can hold either an 8-byte long or an 8-byte long long, and if i, j, and k are all zero, the storage at address p would never be read using any type other than the one with which it was last written, and any pointer that is used to access storage using any type would be laundered before the next time it is used to write the storage using a new type. If the implicit creation rules are ever supposed to be useful, they should be usable here, where code goes out of its way to make it clear that storage is getting recycled for use as different types.
#include <cstddef>
#include <new>
long test(void *p, int i, int j, int k)
{
    long long temp;
    long* ul1 = std::launder(reinterpret_cast<long*>(p));
    ul1[i] = 1234;
    // Contents of storage at ul1[i] will never again be used
    long long* ull2 = std::launder(reinterpret_cast<long long*>(ul1));
    ull2[j] = 2345;
    temp = ull2[k];
    // Contents of storage at ull2[j] will never again be used
    long* ul4 = std::launder(reinterpret_cast<long*>(ull2));
    ul4[k] = 3456;
    ul4[k] = temp;
    return ul4[i];
}
Unfortunately, neither clang nor gcc will correctly process this code unless invoked with -fno-strict-aliasing, and in that mode they relax the type-based access constraints in the C and C++ Standard, rendering them moot.
Upvotes: -1
Reputation: 39869
*reinterpret_cast<uint32_t*>(&buffer.data()[i + 2 * S]) = 42;
This is undefined behavior because you forgot std::launder.
Without std::launder, you are accessing a std::byte through a glvalue of type uint32_t, which is UB because std::byte is not type-accessible through uint32_t.
It doesn't matter whether a uint32_t exists in those bytes; reinterpret_cast without laundering doesn't give you a pointer to it.
As for whether implicit object creation takes place there:
std::vector<std::byte> has to maintain an array of bytes internally ([vector.data] implies this), but not necessarily one where implicit objects for you are created.
If std::vector simply allocated some bytes (with std::allocator, which obtains its storage from operator new, this is guaranteed) and gave you a pointer, then you could obviously use the implicitly created objects there, but it might do a lot more.
For example, it could do placement-new for each individual byte when setting them to zero, which would end the lifetime of any implicit uint32_t in the same place.
You're at best relying on implementation details of std::vector with this code.
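If the goal is only to store uint32_t values in a std::vector<std::byte> and read them back, one way to avoid depending on how the vector creates its elements is to copy object representations with std::memcpy rather than writing through a casted pointer. This is a different technique from the std::launder approach discussed in this answer, shown only as a minimal sketch; store_u32, load_u32, and make_filled_buffer are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy the bytes of `value` into the buffer at `offset`.
// No uint32_t object is ever accessed inside the vector's storage.
void store_u32(std::vector<std::byte>& buffer, std::size_t offset, std::uint32_t value) {
    assert(offset + sizeof value <= buffer.size());
    std::memcpy(buffer.data() + offset, &value, sizeof value);
}

// Hypothetical helper: copy sizeof(uint32_t) bytes out of the buffer into a
// local uint32_t and return it.
std::uint32_t load_u32(const std::vector<std::byte>& buffer, std::size_t offset) {
    assert(offset + sizeof(std::uint32_t) <= buffer.size());
    std::uint32_t value;
    std::memcpy(&value, buffer.data() + offset, sizeof value);
    return value;
}

// Hypothetical helper: build a byte buffer holding `count` copies of `value`.
std::vector<std::byte> make_filled_buffer(std::size_t count, std::uint32_t value) {
    std::vector<std::byte> buffer(count * sizeof(std::uint32_t));
    for (std::size_t i = 0; i < count; ++i)
        store_u32(buffer, i * sizeof(std::uint32_t), value);
    return buffer;
}
```

Calling make_filled_buffer(10, 42) and then load_u32(buffer, 0) corresponds to the write loop and the read-back in the question. Compilers typically turn such fixed-size memcpy calls into plain loads and stores, though that is an optimization and not a guarantee.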
If not, what am I missing? And is there a way to make it legal using the C++23 subset that LLVM 17 (Clang and libc++) supports?
Yes, but ideally, don't use std::vector if you need storage for implicitly created objects.
Use something like std::unique_ptr<std::byte[]>:
// obtain uninitialized, dynamically allocated byte[]
// objects are implicitly created inside (see [intro.object])
std::unique_ptr<std::byte[]> buffer
    = std::make_unique_for_overwrite<std::byte[]>(N * S);
for (size_t i = 0; i < N * S; i += S) {
    // obtain a pointer to the byte where the uint32_t is stored
    // (i already advances in steps of S, so it is the byte offset)
    std::byte* byte = buffer.get() + i;
    // obtain a pointer to the uint32_t
    uint32_t* uint = std::launder(reinterpret_cast<uint32_t*>(byte));
    // overwrite its value with 42
    *uint = 42;
}
This code is still highly questionable because you could have just allocated a uint32_t[] in the first place.
If all objects in your buffer have the same type, you could also simplify this code by doing:
uint32_t* integers = std::launder(reinterpret_cast<uint32_t*>(buffer.get()));
for (size_t i = 0; i < N; ++i) { // or use std::fill
    integers[i] = 42;
}
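For reference, the pieces above can be put together into one self-contained function, including the read-back step from the question. This is only a sketch under a few assumptions: fill_and_read_first is a hypothetical name, the buffer is created with new std::byte[n] (which is what std::make_unique_for_overwrite<std::byte[]> is specified to do), and the read-back uses std::memcpy instead of the question's byte-by-byte loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <new>

// Hypothetical helper: fill a raw byte buffer with `count` uint32_t values,
// then read the first one back from the underlying bytes.
std::uint32_t fill_and_read_first(std::size_t count, std::uint32_t value) {
    assert(count > 0);
    constexpr std::size_t S = sizeof(std::uint32_t);

    // Uninitialized byte[]; starting its lifetime implicitly creates objects
    // in the storage it occupies.
    std::unique_ptr<std::byte[]> buffer(new std::byte[count * S]);

    // Obtain a pointer to the implicitly created uint32_t objects.
    std::uint32_t* integers =
        std::launder(reinterpret_cast<std::uint32_t*>(buffer.get()));
    for (std::size_t i = 0; i < count; ++i)
        integers[i] = value;

    // Copy the first S bytes of the buffer into a separate uint32_t.
    std::uint32_t x;
    std::memcpy(&x, buffer.get(), S);
    return x;
}
```

With this sketch, fill_and_read_first(10, 42) plays the role of the question's snippet, and the returned value takes the place of x in its final assert.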
Both the std::vector case and the std::unique_ptr case are "fine in practice" in terms of alignment.
[basic.stc.dynamic.allocation] explains that for operator new:
the storage is aligned for any object that does not have new-extended alignment
i.e. you get the minimum guaranteed alignment of __STDCPP_DEFAULT_NEW_ALIGNMENT__, and this is going to be at least alignof(void*) and maybe alignof(max_align_t) in any sane implementation.
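The alignment assumption can also be written down in code instead of being taken on trust. The sketch below checks at compile time that uint32_t does not need new-extended alignment, and offers a runtime check for a concrete pointer. The names is_aligned_for_u32 and new_storage_is_aligned_for_u32 are hypothetical, and the runtime check assumes that converting a pointer to std::uintptr_t yields its address as a plain number, which holds on mainstream platforms but is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Compile-time check: storage returned by operator new is suitably aligned
// for uint32_t as long as uint32_t has no new-extended alignment.
static_assert(alignof(std::uint32_t) <= __STDCPP_DEFAULT_NEW_ALIGNMENT__,
              "uint32_t would need new-extended alignment");

// Hypothetical helper: does `p` satisfy the alignment requirement of uint32_t?
// Assumes the pointer-to-integer conversion yields the numeric address.
bool is_aligned_for_u32(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(std::uint32_t) == 0;
}

// Hypothetical helper: allocate `bytes` bytes with operator new and report
// whether the returned storage is aligned for uint32_t.
bool new_storage_is_aligned_for_u32(std::size_t bytes) {
    assert(bytes >= sizeof(std::uint32_t));
    void* p = ::operator new(bytes);
    const bool ok = is_aligned_for_u32(p);
    ::operator delete(p);
    return ok;
}
```

A check like is_aligned_for_u32 could be placed in an assert before the std::launder call when the buffer comes from a source whose alignment is less obvious.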
Upvotes: 2