Serge Ballesta

Reputation: 148965

Is it legal to alias a struct and an array?

Pointer arithmetic between consecutive members of the same type in a struct used to be common practice, even though pointer arithmetic is only valid inside an array. In C++ it would be explicitly Undefined Behaviour, because an array can only be created by a declaration or a new expression. But the C language defines an array as a contiguously allocated nonempty set of objects with a particular member object type, called the element type (n1570 draft for C11, 6.2.5 Types §20). So provided we can make sure that the members are consecutive (meaning no padding between them), it could be legal to view them as an array.

Here is a simplified example, that compiles without a warning and gives expected results at run time:

#include <stdio.h>
#include <stddef.h>
#include <assert.h>

struct quad {
    int x;
    int y;
    int z;
    int t;
};

int main() {
    // ensure members are consecutive (note 1)
    static_assert(offsetof(struct quad, t) == 3 * sizeof(int),
        "unexpected padding in quad struct");
    struct quad q;
    int *ix = &q.x;
    for(int i=0; i<4; i++) {
        ix[i] = i;
    }
    printf("Quad: %d %d %d %d\n", q.x, q.y, q.z, q.t);
    return 0;
}

It does not really make sense here, but I have seen real-world examples where iterating over the members of a struct allows simpler code with less risk of typos.

Question:

In the above example, is the static_assert enough to make the aliasing of the struct with an array legal?


(note 1) As a struct describes a sequentially allocated nonempty set of member objects, later members must have increasing addresses. The compiler may, however, insert padding between them. So the offset of the last member (here t) is 3 times sizeof(int) plus the total padding before it. If the offset is exactly 3 * sizeof(int), then there is no padding between the members of the struct.
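The reasoning in note 1 can be spelled out member by member. The following is only a minimal sketch, assuming a C11 compiler; the helper quad_is_packed is a hypothetical name introduced for illustration. Note that these checks only establish the layout; whether that layout is enough to allow array-style access is exactly what the question asks.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

struct quad {
    int x;
    int y;
    int z;
    int t;
};

/* Compile-time checks: each member starts exactly where the previous one ends. */
static_assert(offsetof(struct quad, y) == 1 * sizeof(int), "padding before y");
static_assert(offsetof(struct quad, z) == 2 * sizeof(int), "padding before z");
static_assert(offsetof(struct quad, t) == 3 * sizeof(int), "padding before t");
/* No trailing padding either. */
static_assert(sizeof(struct quad) == 4 * sizeof(int), "trailing padding in quad");

/* Hypothetical helper: the same layout check as a run-time predicate. */
static int quad_is_packed(void)
{
    return offsetof(struct quad, t) == 3 * sizeof(int)
        && sizeof(struct quad) == 4 * sizeof(int);
}
```

Since offsets are strictly increasing, checking only the last member (as the question does) already implies the intermediate checks; they are listed here to make the argument explicit.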


The question proposed as a duplicate contains both an accepted answer suggesting that it would be UB, and a +1 answer suggesting that it could be legal, because I can ensure that no padding exists.

Upvotes: 5

Views: 1302

Answers (5)

Serge Ballesta

Reputation: 148965

It would be UB. As established in that other question, the static_assert can test for possible padding in a conformant way. So yes the 4 members of the struct are indeed consecutively allocated.

But the real problem is that consecutive allocation is necessary but not sufficient to constitute an array. Even though I could not find a clear reference for it in the C standard, objects cannot overlap during their lifetime; this is stated more clearly in the C++ standard. They can be members of an aggregate (struct or array), but aggregates are not allowed to overlap. This is consistent with the response to Defect Report #017 dated 10 Dec 1992 to C89, cited by Antti Haapala in his answer to the proposed duplicate.

Even though C has no new expression, allocated storage has the particular property of having no declared type. That allows objects to be created dynamically in that storage, but the lifetime of an allocated object ends when an object of a different type is created at its address. So even in allocated memory we cannot have both an array and a struct at the same time.

According to Lundin's answer, type punning through a union between an array and a struct should work, because a (non-normative) note says:

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type

and both types will have the same representation: 4 consecutive integers.

Without unions, a way to iterate through the members of a struct would be at the byte level, because 6.3.2.3 Conversions/Pointers says:

7 ... When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

char *p = (char *) &q;  // q is the struct quad object from the question
for (int i = 0; i < 4; i++) {
    int *ix = (int *) (p + i * sizeof(int));  // Ok: points to the expected int member
    *ix = i;
}

But pointer arithmetic on non-character types to iterate over the members of a struct is UB, simply because individual members of a struct cannot be members of an array at the same time.

Upvotes: 1

Ajay Brahmakshatriya

Reputation: 9203

To start with -

Quoting C11, chapter §6.5.2.1p2

A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). ...

This means ix[i] evaluates to *(ix + i). A subexpression here is ix + i, where ix has type pointer to int.

Now,

Quoting C11, chapter §6.5.6p7

For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

We thus know that ix behaves as a pointer to an array of length one. Even constructing a pointer beyond that length (except for the one-past-the-end pointer) is Undefined Behavior, let alone dereferencing it.

This leads me to conclude that it is indeed not allowed.
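The consequence of 6.5.6p7 can be illustrated with a small sketch. It only marks which expressions are defined under the reading given above; the helper name reachable_from_x is hypothetical and not taken from the question.

```c
#include <stddef.h>   /* ptrdiff_t */

struct quad { int x, y, z, t; };

/* Hypothetical helper: counts how many ints can be reached from &q->x by
   pointer arithmetic when q->x is treated as an array of length one. */
static ptrdiff_t reachable_from_x(struct quad *q)
{
    int *first = &q->x;      /* points to a lone int: "array of length one" */
    int *end   = first + 1;  /* defined: one past the end, must not be dereferenced */
    /* int *bad = first + 2;    undefined: beyond one past the end */
    return end - first;      /* defined: both pointers belong to the same "array" */
}
```

Under this reading the function returns 1, whatever the layout of the struct is; the static_assert on padding does not change the bounds that 6.5.6 attaches to the pointer.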

Upvotes: 1

Lundin

Reputation: 213940

No, it isn't legal to alias a struct and an array like this; it violates strict aliasing. The work-around is to wrap the struct in a union, which contains both an array and the individual members:

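union something {
  struct {        // anonymous struct (C11): must have no tag to contribute members
    int x;
    int y;
    int z;
    int t;
  };

  int array [4];  // same storage, viewed as a real array
};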

This dodges the strict aliasing violation, but you may still have padding bytes, which you can detect with the static assert.

Another issue remains, and that is that you can't use pointer arithmetic on an int* pointing at the first member of the struct, for various obscure reasons outlined in the specified behavior of the additive operators - they require that the pointer points at an array type.

The best way to dodge all of this is to simply use the array member of the union above. This together with a static assert results in well-defined, rugged and portable code.
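Put together, the approach might look like the following sketch. It assumes a C11 compiler with anonymous struct support; the names quad_u and quad_fill are hypothetical. Reading the named members after writing through the array relies on the union type-punning note quoted in another answer.

```c
#include <assert.h>   /* static_assert (C11) */

union quad_u {
    struct {          /* anonymous struct: members are accessed as u.x, u.y, ... */
        int x;
        int y;
        int z;
        int t;
    };
    int array[4];     /* the same storage, as a genuine array object */
};

/* If the struct contained any padding, the union would be larger than the array. */
static_assert(sizeof(union quad_u) == 4 * sizeof(int),
              "unexpected padding in quad_u");

/* Hypothetical helper: iterate over the array member instead of the struct. */
static void quad_fill(union quad_u *u)
{
    for (int i = 0; i < 4; i++) {
        u->array[i] = i;   /* arithmetic stays inside a real array object */
    }
}
```

After quad_fill(&u), the values can be read back as u.x, u.y, u.z and u.t, so code that wants named members and code that wants a loop can share the same object.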

(In theory, you could also use a pointer to character type to iterate through the struct - unlike int* this would be allowed as per 6.3.2.3/7. But this is a more messy solution if you have no interest in the individual bytes.)

Upvotes: 4

I'm gonna argue UB. First and foremost, the mandatory quote from 6.5.6 Additive operators:

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i-n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

I emphasized what I consider the crux of the matter. You are right when you say that an array object is "a contiguously allocated nonempty set of objects with a particular member object type, called the element type". But is the converse true? Does a consecutively allocated set of objects constitute an array object?

I'm going to say no. Objects need to be explicitly created.

So for your example, there is no array object. There are generally two ways to create objects in C. Declare them with automatic, static or thread local duration. Or allocate them and give the storage an effective type. You did neither to create an array. That makes the arithmetic officially undefined.
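The two ways of creating an array object mentioned above can be sketched as follows. This is only an illustration under the interpretation given in this answer; sum4, sum_declared and sum_allocated are hypothetical names.

```c
#include <stdlib.h>   /* malloc, free */

/* Hypothetical helper: sums 4 ints using array-style pointer arithmetic. */
static int sum4(const int *p)
{
    int sum = 0;
    for (int i = 0; i < 4; i++) {
        sum += p[i];
    }
    return sum;
}

/* Way 1: the array object is created by a declaration. */
static int sum_declared(void)
{
    int a[4] = {1, 2, 3, 4};
    return sum4(a);
}

/* Way 2: storage is allocated, and storing ints gives it its effective type. */
static int sum_allocated(void)
{
    int *p = malloc(4 * sizeof *p);
    if (p == NULL) {
        return -1;             /* allocation failed */
    }
    for (int i = 0; i < 4; i++) {
        p[i] = i + 1;
    }
    int s = sum4(p);
    free(p);
    return s;
}
```

In both functions the pointer passed to sum4 points into an array object that was created as such, which is what the struct in the question lacks.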

Upvotes: 1

Bathsheba

Reputation: 234715

The problem here is your definition of contiguously allocated: "we can make sure that the members are consecutive (meaning no padding between them)".

Although that is a corollary of being contiguously allocated, it does not define the property.

Your structure members are separate variables with automatic storage duration, laid out in a particular order, with or without padding depending on how you are able to control your compiler. That's all. As such, you can't use pointer arithmetic to reach one member given the address of another, and the behaviour on doing so is undefined.

Upvotes: 2
