nlsbgr

Reputation: 11

Pointer-arithmetic on contiguous non-array objects

std::vector is generally considered unimplementable before C++20 (as discussed in P0593): its elements cannot be stored in an internal array while meeting the required performance guarantees, and pointer arithmetic on the pointers it returns, in particular the one returned by .data(), is not allowed:

char * storage = new char[sizeof(int) * 3];
int * data = new(storage) int(1);
new(storage + sizeof(int)) int(2);
new(storage + sizeof(int) * 2) int(3);

data[2] = 5; // undefined behavior, as there is no array of int at the address data points to
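For comparison, here is a minimal sketch of the allocator-based shape a vector-like container typically takes. The function name sum_after_indexed_write is hypothetical and only serves the illustration. As far as I understand the C++20 wording, std::allocator<T>::allocate(n) is specified to start the lifetime of an array T[n] (but not of its elements), which is what makes the indexed access well-defined there; before C++20 the same code has the problem described above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: builds three ints in allocator-provided storage,
// writes through an index, and returns the sum of the elements.
int sum_after_indexed_write() {
    std::allocator<int> alloc;
    using traits = std::allocator_traits<std::allocator<int>>;

    const std::size_t n = 3;
    int* data = traits::allocate(alloc, n);  // storage for n ints, no element constructed yet

    for (std::size_t i = 0; i < n; ++i)
        traits::construct(alloc, data + i, static_cast<int>(i) + 1);  // 1, 2, 3

    assert(data[1] == 2);
    data[2] = 5;  // the indexed access under discussion

    int sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += data[i];

    for (std::size_t i = 0; i < n; ++i)
        traits::destroy(alloc, data + i);
    traits::deallocate(alloc, data, n);
    return sum;  // 1 + 2 + 5
}
```

Calling sum_after_indexed_write() yields 8 on mainstream compilers; the open question is only whether the standard guarantees it before C++20.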

However, the pre-C++17 standard contains these quotes:

[basic.compound] If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained.

[expr.add] For the purposes of these operators, a pointer to a nonarray object behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

[expr.add] Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object.

This should make the following code legal and conforming:

char * storage = new char[sizeof(int) * 3];
int * data = new(storage) int(1);
new(storage + sizeof(int)) int(2);
new(storage + sizeof(int) * 2) int(3);

int * tmp = data + 1; // legal due to [expr.add]
                      // tmp now points to the int placed at storage + sizeof(int) due to [basic.compound]
                      // requires laundering as of C++17
tmp = tmp + 1;        // tmp now points to the int placed at storage + sizeof(int) * 2
*tmp = 5;

(data + 1)[1] = 5;    // equivalent to the above

In other words, contiguously stored objects of the same type can reach each other so long as the pointers used to do so are only repeatedly moved step-wise to their immediate neighbour instead of using direct pointer arithmetic. Am I reading the standard correctly?

C++17 changes the wording of [basic.compound] and [expr.add], so the pointer has to be laundered after each step before it actually points to the int object at that location; otherwise the approach should work equivalently.
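A sketch of what the laundered, step-wise variant might look like in C++17. The function name stepwise_with_launder is hypothetical. Whether std::launder may legitimately be applied to the one-past-the-end pointer data + 1 is, as far as I can tell, itself part of what is in doubt here (its precondition talks about which bytes are reachable through the argument), so treat this as an illustration of the idea and not as a ruling on its conformance.

```cpp
#include <cassert>
#include <new>

// Hypothetical helper: walks from one int to its neighbour one step at a
// time, laundering after every step, then writes to the third int.
int stepwise_with_launder() {
    char* storage = new char[sizeof(int) * 3];
    int* data = new (storage) int(1);
    new (storage + sizeof(int)) int(2);
    new (storage + sizeof(int) * 2) int(3);

    int* tmp = std::launder(data + 1);  // intended: the int at storage + sizeof(int)
    tmp = std::launder(tmp + 1);        // intended: the int at storage + sizeof(int) * 2
    assert(*tmp == 3);
    *tmp = 5;

    int sum = *data + *std::launder(data + 1) + *tmp;
    delete[] storage;  // int is trivially destructible, so no destructor calls are needed
    return sum;        // 1 + 2 + 5
}
```

On mainstream compilers stepwise_with_launder() returns 8; that observation says nothing about whether the standard blesses it.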


Edited to remove the use of reinterpret_cast, as the legality of its use is not the focus of this question.

Upvotes: 1

Views: 415

Answers (1)

Goswin von Brederlow

Reputation: 12322

I don't think that code is perfectly legal even with C++20.

https://en.cppreference.com/w/cpp/language/reinterpret_cast

Whenever an attempt is made to read or modify the stored value of an object of type DynamicType through a glvalue of type AliasedType, the behavior is undefined unless one of the following is true:

  • AliasedType and DynamicType are similar.
  • AliasedType is the (possibly cv-qualified) signed or unsigned variant of DynamicType.
  • AliasedType is std::byte, (since C++17) char, or unsigned char: this permits examination of the object representation of any object as an array of bytes.

In

int * data = reinterpret_cast<int *>(storage);

the DynamicType is char and the AliasedType is int, which fits none of the criteria. Even if the pointer arithmetic is allowed, the *tmp = 5; still has to follow the type aliasing rules of the initial pointer.

To access the int you create via placement new you have to capture its pointer:

int * p = new(storage) int(1);

But that isn't an array, so you can't do pointer arithmetic (<= C++17) on it to get to the second int.
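For contrast, a minimal sketch of the C++20 route, assuming the implicit object creation rules from P0593 apply: ::operator new is one of the operations that implicitly creates objects in the storage it returns, so the returned pointer can be treated as pointing to the first element of an int[3]. The function name indexed_write_cxx20 is hypothetical.

```cpp
#include <cassert>
#include <new>

// Hypothetical helper: relies on implicit object creation (P0593) so that
// the storage returned by ::operator new already holds an int[3].
int indexed_write_cxx20() {
    void* raw = ::operator new(sizeof(int) * 3);
    int* data = static_cast<int*>(raw);  // assumed to point at the first element of an int[3]

    data[0] = 1;
    data[1] = 2;
    data[2] = 3;
    assert(data[2] == 3);

    data[2] = 5;  // ordinary array indexing, no step-wise walking needed
    int last = data[2];

    ::operator delete(raw);
    return last;
}
```

Here no placement new is needed for int because it is an implicit-lifetime type; for a type with a non-trivial constructor the elements would still have to be constructed explicitly.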

PS: Note that a char * with the value of the address of storage and an int * with the value of the address of storage may not be represented by the same bit pattern in the hardware.

Upvotes: 2
