Casting, structs and common initial sequence

Given:

struct A {
    int i;
    char c;
    double d;

    int x;
    char y;
    double z;
};

struct B {
    int i;
    char c;
    double d;

    int x;
    long y;
    double z;
};

struct A *a = ...

as I understand it, this is okay because the field being accessed is part of the common initial sequence of the two structs:

((struct B *)a)->d = 1.2;

but this is not, because the differing y fields make z not part of the common initial sequence:

((struct B *)a)->z = 1.2;

Is this correct?

Upvotes: 2

Answers (3)

supercat

Reputation: 81347

The way N1570 6.5p7 is written makes no allowance for any operation that would access a struct or union using an lvalue of non-character member type (it does allow a member of a struct or union to be accessed using an arbitrary lvalue of an enclosing struct or union type, but not the reverse). Thus, given something like:

struct blob { int size; int *dat; };

void clear_blob(struct blob *p)
{
  for (int i=0; i < p->size; i++)
    p->dat[i] = 0;
}

the Standard avoids (IMHO deliberately, though I don't think all members of the Committee noticed this) requiring that a compiler allow for the possibility that p->dat might hold the address of p->size while p->size holds a value of 2 or greater. In that case, the store would cause size to hold zero, which would in turn cause the loop to exit after its first iteration rather than running for size iterations.

Of course, the language would be largely useless if implementations could not be relied upon to allow the stored values of structures and unions to be accessed using lvalues of member types in at least some circumstances. The Committee, however, left the question of exactly what those circumstances should be as a quality-of-implementation issue, outside the Standard's jurisdiction. It expected that people designing compilers for particular kinds of customers (number-crunchers, systems programmers, etc.) would know more than the Committee about what those customers would need, and saw no need to mandate that compilers support constructs that compiler writers would have to be deliberately obtuse to ignore.

Neither clang nor gcc will reliably support code which takes the address of a union member and uses the pointer to access an object of that type except when using -fno-strict-aliasing mode. Even given something like:

struct s1 {int x;};
struct s2 {int x;};
union { struct s1 v1[10]; struct s2 v2[10];} uarr;
int test(int i, int j)
{
    if ((uarr.v1+i)->x)
        (uarr.v2+j)->x = 0;
    return (uarr.v1+i)->x;
}

they won't recognize that the access to (uarr.v2+j)->x might interact with (uarr.v1+i)->x. The ability to handle such code, however, is not required for conformance but is merely a quality-of-implementation issue. While present versions of clang and gcc happen to support similar code written with the [] operator, the Standard does not mandate the behavior of an expression of the form E1[E2] in any case where it would not be equivalent to (*((E1)+(E2)))--a form which clang and gcc do not support.
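For comparison, here is a sketch of the bracket-indexed variant mentioned above, where every access is spelled as a member access on the union object itself. The names test_indexed and check_indexed are hypothetical, and the behavior shown is what current gcc and clang are described as providing, not something this answer claims the Standard mandates for every implementation.

```c
#include <assert.h>

struct s1 { int x; };
struct s2 { int x; };
union { struct s1 v1[10]; struct s2 v2[10]; } uarr;

/* Same logic as test() above, but using [] on the union members directly
   instead of forming pointers with (uarr.v1 + i). */
int test_indexed(int i, int j)
{
    if (uarr.v1[i].x)
        uarr.v2[j].x = 0;
    return uarr.v1[i].x;
}

/* Hypothetical self-check: when i == j, the store through v2 should be
   visible when reading back through v1. */
void check_indexed(void)
{
    uarr.v1[0].x = 1;
    assert(test_indexed(0, 0) == 0);
}
```

Comparing the optimized output of this function with that of the pointer-arithmetic version is one way to see whether a given compiler treats the two forms differently.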

Upvotes: 2

th33lf

Reputation: 2275

From what I understand, it is currently undefined behaviour on all platforms, as per the strict aliasing rules. If you wrap both the structs in a common union, your first example might work as intended. However, your second example becomes implementation dependent: it might work on platforms where z happens to land at the same offset in both structs (for example, where long is 4 bytes and double is 8-byte aligned), but not on others, such as typical LP64 systems where long is 8 bytes.

union C 
{
    struct A a;
    struct B b;
};

union C u = {0};
union C *c = &u;
c->a.d = 1.2; // Valid
c->b.d = 1.5; // Valid and sets the value of a.d as well
c->b.z = 1.2; // Valid but may not set the value of a.z as expected
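Whether z lines up in the two structs can be checked rather than guessed. The following is a minimal sketch using offsetof; the helper names (common_prefix_offsets_match, check_common_prefix, print_tail_offsets) are made up for illustration. Note that the common initial sequence here is i, c, d and x, since the types only diverge at y.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct A { int i; char c; double d; int x; char y; double z; };
struct B { int i; char c; double d; int x; long y; double z; };

/* Returns 1 if every member of the common initial sequence (i, c, d, x)
   sits at the same offset in both structs, 0 otherwise. */
int common_prefix_offsets_match(void)
{
    return offsetof(struct A, i) == offsetof(struct B, i)
        && offsetof(struct A, c) == offsetof(struct B, c)
        && offsetof(struct A, d) == offsetof(struct B, d)
        && offsetof(struct A, x) == offsetof(struct B, x);
}

/* Convenience check: aborts if the prefix offsets ever disagree. */
void check_common_prefix(void)
{
    assert(common_prefix_offsets_match());
}

/* Prints where y and z land. These members lie outside the common
   initial sequence, so the two structs need not agree on them. */
void print_tail_offsets(void)
{
    printf("A: y at %zu, z at %zu\n", offsetof(struct A, y), offsetof(struct A, z));
    printf("B: y at %zu, z at %zu\n", offsetof(struct B, y), offsetof(struct B, z));
}
```

Calling print_tail_offsets() on the target platform shows whether a.z and b.z overlap there. Matching offsets only describe layout; they do not by themselves make the cast in the question well-defined.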

Upvotes: 2

Sneftel

Reputation: 41542

From the standard, 6.5.2.3 p5:

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the complete type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

So your first snippet is almost okay. The C standard only allows you to use the common initial sequence if a union involving them is visible at that point. Thus, adding union Foo { struct A a; struct B b; }; after the declarations of A and B should make your first snippet well-defined, even if the structs you're examining aren't actually stored in such a union. (Presumably this stricture is important because it allows the compiler to make more daring optimizations for structures you don't intend to alias.)

As th33lf said, your second snippet is UB, period.
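The least contentious way to rely on the guarantee quoted above is to keep the object in the union itself and access it through the union's members. A minimal sketch, with union AB, write_a_read_b and check_union_view as illustrative names that do not appear in the question:

```c
#include <assert.h>

struct A { int i; char c; double d; int x; char y; double z; };
struct B { int i; char c; double d; int x; long y; double z; };

/* Both structs live in one union, and its complete type is visible
   wherever the accesses below happen. */
union AB { struct A a; struct B b; };

/* Stores d through the A member, then inspects it through the B member.
   d is part of the common initial sequence (i, c, d, x). */
double write_a_read_b(union AB *u, double value)
{
    u->a.d = value;
    return u->b.d;
}

/* Hypothetical self-check: the value written via a is seen via b. */
void check_union_view(void)
{
    union AB u = { .a = { 0 } };
    assert(write_a_read_b(&u, 1.5) == 1.5);
}
```

The same pattern would not be appropriate for z, which lies past the point where the two member lists diverge.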

Upvotes: 4
