Reputation: 70492
Is the following program a strictly conforming program in C? I am interested in c90 and c99 but c11 answers are also acceptable.
#include <stdio.h>
#include <string.h>
struct S { int array[2]; };
int main () {
struct S a = { { 1, 2 } };
struct S b;
b = a;
if (memcmp(b.array, a.array, sizeof(b.array)) == 0) {
puts("ok");
}
return 0;
}
In comments to my answer in a different question, Eric Postpischil insists that the program output will change depending on the platform, primarily due to the possibility of uninitialized padding bits. I thought the struct assignment would overwrite all bits in b
to be the same as in a
. But, C99 does not seem to offer such a guarantee. From Section 6.5.16.1 p2:
In simple assignment (
=
), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.
What is meant by "converted" and "replaces" in the context of compound types?
Finally, consider the same program, except that the definitions of a
and b
are made global. Would that program be a strictly conforming program?
Edit: Just wanted to summarize some of the discussion material here, and not add my own answer, since I don't really have one of my own creation.
b.array
may or may not contain bits set differently from a.array
.a
doesn't need to be converted since it is the same type as b
, but the replacement is by value, and done member by member.a
and b
are made global, post assignment, b.array
may or may not contain bits set differently from a.array
. (There was little discussion about the padding bytes in b
, but the posted question was not about structure comparison. c99 lacks a mention of how padding is initialized in static storage, but c11 explicitly states it is zero initialized.)memcmp
is well defined if b
was initialized with memcpy
from a
.My thanks to all involved in the discussion.
Upvotes: 26
Views: 4403
Reputation: 4378
My opinion is that it is strictly conforming. According to 4.5 that Eric Postpischil mentioned:
A strictly conforming program shall use only those features of the language and library specified in this International Standard. It shall not produce output dependent on any unspecified, undefined, or implementation-defined behavior, and shall not exceed any minimum implementation limit.
The behavior in question is the behavior of memcmp
, and this is well-defined, without any unspecified, undefined or implementation-defined aspects. It works on the raw bits of the representation, without knowing anything about the values, padding bits or trap representations. Thus the result (but not the functionality) of memcmp
in this specific case depends on the implementation of the values stored within these bytes.
Footnote 43) in 6.2.6.2:
It is possible for objects x and y with the same effective type T to have the same value when they are accessed as objects of type T, but to have different values in other contexts. In particular, if == is defined for type T, then x == y does not imply that memcmp(&x, &y, sizeof (T)) == 0. Furthermore, x == y does not necessarily imply that x and y have the same value; other operations on values of type T may distinguish between them.
EDIT:
Thinking it a bit further, I'm not so sure about the strictly conforming anymore because of this:
It shall not produce output dependent on any unspecified [...]
Clearly the result of memcmp
depends on the unspecified behavior of the representation, thereby fulfilling this clause, even though the behavior of memcmp
itself is well defined. The clause doesn't say anything about the depth of functionality until the output happens.
So it is not strictly conforming.
EDIT 2:
I'm not so sure that it will become strictly conforming when memcpy
is used to copy the struct. According to Annex J, the unspecified behavior happens when a
is initialized:
struct S a = { { 1, 2 } };
Even if we assume that the padding bits won't change and memcpy
always returns 0, it still uses the padding bits to obtain its result. And it relies on the assumption that they won't change, but there is no guarantee in the standard about this.
We should differentiate between paddings bytes in structs, used for alignment, and padding bits in specific native types like int
. While we can safely assume that the padding bytes won't change, but only because there is no real reason for it, the same does not apply for the padding bits. The standard mentions a parity flag as an example of a padding bit. This may be a software function of the implementation, but it may as well be a hardware function. Thus there may be other hardware flags used for the padding bits, including one that changes on read accesses for whatever reason.
We will have difficulties in finding such an exotic machine and implementation, but I see nothing that forbid this. Correct me if I'm wrong.
Upvotes: 0
Reputation: 36071
In C99 §6.2.6
§6.2.6.1 General
1 The representations of all types are unspecified except as stated in this subclause.
[...]
4 [..] Two values (other than NaNs) with the same object representation compare equal, but values that compare equal may have different object representations.
6 When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.42)
42) Thus, for example, structure assignment need not copy any padding bits.
43) It is possible for objects x and y with the same effective type T to have the same value when they are accessed as objects of type T, but to have different values in other contexts. In particular, if == is defined for type T, then x == y does not imply that memcmp(&x, &y, sizeof (T)) == 0. Furthermore, x == y does not necessarily imply that x and y have the same value; other operations on values of type T may distinguish between them.
§6.2.6.2 Integer Types
[...]
2 For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits;[...]
[...]
5 The values of any padding bits are unspecified.[...]
In J.1 Unspecified Behavior
- The value of padding bytes when storing values in structures or unions (6.2.6.1).
[...]
- The values of any padding bits in integer representations (6.2.6.2).
Therefore there may be bits in the representation of a
and b
that differ while not affecting the value. This is the same conclusion as the other answer, but I thought that these quotes from the standard would be good additional context.
If you do a memcpy
then the memcmp
would always return 0 and the program would be strictly conforming. The memcpy
duplicates the object representation of a
into b
.
Upvotes: 4