Why do we need to declare a variable of union type when nested in a structure in C?

Question

I have a code sample from a tutorial, which says

struct goods {
    char name[20];
    union quantity {
        int count;
        float weight, volume;
    } q;
};

I can't figure out why we need to declare the variable 'q' along with the union type name 'quantity'. Why can't we get away with just 'quantity' and then access the struct fields via dot?

Update: Is it correct that 'quantity' is the name/tag of a union type, while 'q' is not a variable but rather the name of a union member/field which contains sub-members (count, weight, volume)?

Eric Postpischil · Accepted Answer

It is not clear what specific question you are asking about this code, so let’s review the issues.

Members of Structures

When a union declaration appears inside a struct declaration, it is usually declaring a member of that structure that is a union. This member is a part of the structure the same as any other member, such as an int x declared in the structure. Every instance of the structure contains an instance of each of its members, including the union—the union is part of the structure, not a separate thing.
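
To make the containment concrete, here is a minimal sketch using the struct goods from the question. The two helper functions are hypothetical names I introduce only for illustration. They check that the union member q lies inside the storage of the structure and that each structure object carries its own copy of it.

```c
#include <stddef.h>

struct goods {
    char name[20];
    union quantity {
        int count;
        float weight, volume;
    } q;
};

/* Hypothetical helper: returns 1 if the member q lies entirely
   within the storage of a struct goods. */
int union_is_inside_struct(void)
{
    return offsetof(struct goods, q) + sizeof(union quantity)
           <= sizeof(struct goods);
}

/* Hypothetical helper: returns 1 if two struct goods objects
   keep independent values in their own q members. */
int each_instance_has_own_union(void)
{
    struct goods a = { "nails",  { 100 } };  /* initializes a.q.count */
    struct goods b = { "screws", { 250 } };  /* initializes b.q.count */
    return a.q.count == 100 && b.q.count == 250;
}
```

Both helpers should return 1 on a conforming implementation, since a member always occupies part of the object that contains it.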

Names

In this code:

    union quantity {
        int count;
        float weight, volume;
    } q;

the identifier quantity is a tag for the union. In this role, it must appear after a union keyword, always as union quantity. It only names the union type; it does not name any union object or member of a structure. (The same identifier can be used in multiple roles. We could also add a declaration that defined quantity to be a type or an object or member, and then it would have two roles: It could be used as union quantity to refer to the union type, and it could be used by itself to refer to whatever the other declaration declared.)
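
As a small illustration of the two roles, the sketch below declares the union at file scope and also declares an int object named quantity. The object and the function tag_and_object_coexist are hypothetical additions for this example, not part of the original code.

```c
union quantity {
    int count;
    float weight, volume;
};

/* An ordinary identifier spelled the same as the tag. Tags live in
   their own name space, so this does not clash with union quantity. */
int quantity = 3;

/* Hypothetical helper showing both roles in one place. */
int tag_and_object_coexist(void)
{
    union quantity u;     /* "union quantity" names the type */
    u.count = quantity;   /* bare "quantity" names the int object */
    return u.count;
}
```

Dropping the union keyword in the declaration of u would make the compiler look up the int object instead of the type, which is why the tag only works in the form union quantity.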

In the same code above, q is the name of a member of a structure. It is the name for that union quantity object that is in each instance of the struct goods.

With this declaration, if we define a struct goods G;, then G.q refers to the union quantity that is in G, and G.q.count, G.q.weight, and G.q.volume refer to the members in the union G.q. (Only one of those members can be stored at a time, because they all overlap in a union.)
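
A short sketch of that access pattern follows. The helper functions are hypothetical names used only for illustration. The last one checks that the three union members start at the same address, which is why only one of them can hold a value at a time.

```c
struct goods {
    char name[20];
    union quantity {
        int count;
        float weight, volume;
    } q;
};

/* Store and read back the count member through G.q. */
int count_example(void)
{
    struct goods G = { "bolts", { 0 } };
    G.q.count = 12;
    return G.q.count;
}

/* Store and read back the weight member through G.q.
   Writing weight overwrites whatever count held before. */
float weight_example(void)
{
    struct goods G = { "flour", { 0 } };
    G.q.weight = 2.5f;
    return G.q.weight;
}

/* Returns 1 if count, weight, and volume share one starting address. */
int members_share_storage(void)
{
    struct goods G = { "oil", { 0 } };
    return (void *)&G.q.count == (void *)&G.q.weight
        && (void *)&G.q.weight == (void *)&G.q.volume;
}
```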

Anonymous Unions

C 2011 added a new feature: a union or structure can be declared inside another union or structure without a member name:

struct goods {
    char name[20];
    union {
        int count;
        float weight, volume;
    };
};

This does not change the layout of the structure at all; it still has the same members. However, their names are different. Given a struct goods G, we can refer to the count member as G.count instead of G.q.count, and similarly for weight and volume. (Note that, in addition to removing the member name q, this code also removed the tag quantity. The C standard requires that an anonymous structure or union have neither a tag nor a member name. I do not see a technical reason for this. Perhaps it was a choice to avoid errors where member names are inadvertently left out.)
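
To compare the two forms side by side, here is a minimal sketch that needs a compiler supporting C 2011 anonymous unions. The type names goods_named and goods_anon and the helper functions are hypothetical, chosen so both variants can live in one file. The layout check reflects what I would expect on typical implementations.

```c
#include <stddef.h>

/* Variant with a member name: members are reached through q. */
struct goods_named {
    char name[20];
    union {
        int count;
        float weight, volume;
    } q;
};

/* Variant with an anonymous union: no tag and no member name. */
struct goods_anon {
    char name[20];
    union {
        int count;
        float weight, volume;
    };
};

/* Returns 1 if both variants have the same size and place the
   union contents at the same offset. */
int same_layout(void)
{
    return sizeof(struct goods_named) == sizeof(struct goods_anon)
        && offsetof(struct goods_named, q) == offsetof(struct goods_anon, count);
}

/* With the anonymous union, count is accessed directly on G. */
int anon_access(void)
{
    struct goods_anon G = { "washers" };
    G.count = 7;          /* instead of G.q.count */
    return G.count;
}
```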

As to why somebody might give a union member a name rather than make it anonymous, one reason is that the code was written prior to 2011, or after 2011 but for C implementations that did not yet support anonymous members. Another reason is that they wanted to distinguish the union members so anybody reading or writing the code would be alert to the fact that these members were inside something inside the structure, not regular direct structure members.
