Reputation: 77
I'm trying to interpret the C11 standard regarding static (and thread-local) initialisation of a union when not explicitly initialised.
Section 6.7.9 10 (pg 139) states the following:
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
Supposing we're on an amd64 architecture, given the following statement:
static union { uint32_t x; uint16_t y[3]; } u;
Can u.y[2]
contain non-zero values or is it initialised to zero because it is regarded as padding?
I've scoured the C11 standard, but there is little to no explanation of what constitutes padding in a union. In the C99 standard (pg 126) padding isn't mentioned, so in that case u.y[2]
can be non-zero.
Upvotes: 6
Views: 1090
Reputation: 12679
Can u.y[2] contain non-zero values or is it initialised to zero because it is regarded as padding?
u.y[2] is not regarded as padding. It is an element of array y, which is a member of union u.
The union is only as big as necessary to hold its largest member (additional unnamed trailing padding may also be added for the purpose of alignment).
From C Standard#6.7.2.1p17
17 There may be unnamed padding at the end of a structure or union.
The largest member of union u is uint16_t y[3]. So, if there is any padding in union u, it will come after the uint16_t y[3] member (see note 1 below).
As per the C11 Standard, for a union object that has static or thread storage duration and is not initialized explicitly, the compiler initializes the first named member (recursively) and sets any padding to zero bits (#6.7.9p10). Hence, you should not make any assumption about the value of u.y[2]: the compiler is only required to initialize the first named member of the union (see note 2 below), which is uint32_t x in your example, and any padding.
The C Standard does not mention anything about the data segment (initialized/uninitialized), stack, heap, etc. These are all architecture/platform specific. For object initialization, the C Standard only specifies what is to be initialized to 0 and what is not; it does not specify which segment an object of a given storage duration goes in. The Standard's specifications are for compilers, and a good compiler is expected to follow them. Typically, 0-initialized static data goes in .BSS (Block Started by Symbol) and non-0-initialized data goes in .DATA (Data Segment). So you may find that u.y[2] has the value 0, but that may not always be the case.
1) Every modern compiler will automatically use data structure padding depending on architecture. Some compilers even support the warning flag -Wpadded
which generates helpful warnings about structure padding. These warnings help the programmer take manual care in case a more efficient data structure layout is desired.
-Wpadded
Warn if padding is included in a structure, either to align an element of the structure or to align the whole structure. Sometimes when this happens it is possible to rearrange the fields of the structure to reduce the padding and so make the structure smaller.
So, if your compiler supports the warning flag -Wpadded, try compiling your code with it. That will help you understand the padding included by the compiler.
For example:
#include <inttypes.h>
int main() {
static union { uint32_t x; uint16_t y[3]; } u;
}
Let's compile this with the -Wpadded option. My compiler is clang version clang-1000.10.44.4:
# clang -Wpadded p.c
p.c:4:16: warning: padding size of 'union (anonymous at p.c:4:16)' with 2 bytes to alignment boundary [-Wpadded]
static union { uint32_t x; uint16_t y[3]; } u;
^
1 warning generated.
2) A point to note: if you explicitly initialize a union object without using a designated initializer, it is again the first member of the union that is initialized (C11 Standard#6.7.9p17).
Upvotes: 0
Reputation: 224892
The extra space used by y that isn't used by x is not considered padding. Section 6.7.2.1p17 of the C11 standard regarding "Structure and union specifiers" states:
There may be unnamed padding at the end of a structure or union
The bytes used by y in your example that are not used by x are still named, and are therefore not padding.
Your example most likely does have this unnamed padding, since the largest member takes up 6 bytes but one of the members is a uint32_t, which typically requires 4-byte alignment. In fact, on gcc 4.8.5 the size of this union is 8 bytes. So the memory layout of this union looks like this:
----- --| ---|
0 | 0 | | |
----- | |-- y[0]
1 | 0 | | |
----- |-- x ---|
2 | 0 | | |
----- | |-- y[1]
3 | 0 | | |
----- --| ---|
4 | 0 | |
----- |-- y[2]
5 | 0 | |
----- ---|
6 | 0 | -- padding
-----
7 | 0 | -- padding
-----
So going by a strict reading of the standard, for a static instance of this union without an explicit initializer:
Bytes 0-3, which correspond to x (i.e. the first named member), are initialized to 0, resulting in x being 0.
Bytes 6-7, which are padding, are initialized to 0 bits.
Bytes 4-5, which correspond to y[2], are covered by neither rule, so their value is not guaranteed.
I tested this on gcc 4.8.5, clang 3.3, and MSVC 2015, and all of them set all bytes to 0 under various optimization settings. However, going by a strict reading of the standard the behavior is not guaranteed, so it's still possible that a different optimization setting of these compilers, different versions of them, or different compilers altogether may do something different.
From a pragmatic standpoint, it would make sense for a compiler to simply set all bytes of a static object to 0 to satisfy this requirement. This is assuming of course that no integer types have padding, floating point types are IEEE754, and NULL pointers have the numerical value of 0. On most systems that most people are likely to come across, this will be the case. Systems where this is not the case might be more likely to leave these bytes set to something other than 0. So again, while these bytes might be set to 0, there is no guarantee.
An important point to keep in mind is that a union can only store one member at a time as per 6.7.2.1p16:
The size of a union is sufficient to contain the largest of its members. The value of at most one of the members can be stored in a union object at any time. A pointer to a union object, suitably converted, points to each of its members (or if a member is a bit-field, then to the unit in which it resides), and vice versa.
So if a union
with static storage duration is uninitialized, it is only safe to access the first member since that is the one which was implicitly initialized.
The only exception to this is if the union contains structures with a common set of initial members, in which case you can access any of the common elements of the inner structs. This is detailed in section 6.5.2.3p6:
One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.
Upvotes: 5
Reputation: 67835
If the storage is automatic, it may contain any value, as it is not initialized. If the storage is static, it will be initialized to zeroes.
Padding does not affect your union, as padding is something that does not belong to any member of the structure or union.
For example, if your implementation pads data to an 8-byte boundary, no padding will be added to the union at all. There will be a 2-byte gap between this union and the next object.
Upvotes: -1