Reputation: 481
I am a bit confused about the exact meaning of operator == in C.
Does it compare the mathematical values the variables represent (depending on their types), or the bit patterns behind the variables? Specifically:
int x = 0x80000000;
unsigned y = x;
x==y // true
So despite the fact that x is a large negative value and y is a large positive they are equal (I guess because they have the same bit pattern).
int64_t x = 0x8000000000000000;
int y = x;
x==y // false
Here it does not matter that the first (least significant) 32 bits in x and y are the same. So it looks like in this case C looks at the values represented by the variables.
What are the official rules and is there an authoritative reference for this (have not found anything useful on this in K&R)?
I have used the gcc compiler for the above examples.
Upvotes: 3
Views: 142
Reputation: 123578
The exact meaning per the language definition:
6.2.6 Representations of types (C 2011 Online Draft)
6.2.6.1 General
4 Values stored in non-bit-field objects of any other object type consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes. The value may be copied into an object of type unsigned char [n] (e.g., by memcpy); the resulting set of bytes is called the object representation of the value. Values stored in bit-fields consist of m bits, where m is the size specified for the bit-field. The object representation is the set of m bits the bit-field comprises in the addressable storage unit holding it. Two values (other than NaNs) with the same object representation compare equal, but values that compare equal may have different object representations.
...
6.5.9 Equality operators
...
4 If both of the operands have arithmetic type, the usual arithmetic conversions are performed. Values of complex types are equal if and only if both their real parts are equal and also their imaginary parts are equal. Any two values of arithmetic types from different type domains are equal if and only if the results of their conversions to the (complex) result type determined by the usual arithmetic conversions are equal.
5 Otherwise, at least one operand is a pointer. If one operand is a pointer and the other is a null pointer constant, the null pointer constant is converted to the type of the pointer. If one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void, the former is converted to the type of the latter.
6 Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.¹⁰⁹⁾
109) Two objects may be adjacent in memory because they are adjacent elements of a larger array or adjacent members of a structure with no padding between them, or because the implementation chose to place them so, even though they are unrelated. If prior invalid pointer operations (such as accesses outside array bounds) produced undefined behavior, subsequent comparisons also produce undefined behavior.
Semantically speaking, the == operator compares values, not bits: 1.0 == 1 evaluates to true even though the two operands have completely different bitwise representations.
However, as part of the comparison, the integer 1 is first converted to the floating-point value 1.0, so that both operands have the same type before their values are compared.
Upvotes: 2
Reputation: 224102
You have jumped to a conclusion in asking whether the comparison is based on values or bit patterns, because there is an important step first. Before comparison, the operands of == are converted to a common type.
As an example, when you compare a 32-bit two’s complement int x with the bit pattern 1000…0000₂ (representing −2,147,483,648) and an unsigned int y with the same bit pattern (representing +2,147,483,648) using x == y, x is first converted to unsigned int, which produces +2,147,483,648. Then +2,147,483,648 is compared to +2,147,483,648, so == reports they are equal.
C 2018 6.5.9 (“Equality operators”) 4 says:
If both of the operands have arithmetic type, the usual arithmetic conversions are performed…
The usual arithmetic conversions are specified in 6.3.1.8. Paragraph 1 starts:
Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result. For the specified operands, each operand is converted… to a type whose corresponding real type is the common real type.
The rules involve some technical details, but, in large part, when you compare two integer types, first each will be promoted to at least int, and then the narrower type will be converted to the wider type. If they are the same width but one is unsigned, the signed type will be converted to the unsigned type. This may change the value.
Once the actual values to be compared are determined, the result of == is defined in terms of the values, not the bit pattern.
(The most common situation where these differ is with floating-point +0 and −0, which represent the same real number and compare equal but have different representations. In most modern environments, all bit patterns in an integer type represent different values, and all bit patterns in a binary floating-point type represent either different values or NaNs except for +0 and −0. There are some less commonly used floating-point types that have multiple representations for some values, analogous to the way 3.5•10⁷ and 35•10⁶ represent the same number.)
Anytime you compare a negative value in a signed integer type to an unsigned type that is the same width or wider (after the promotions), the value of the signed type will be changed before the comparison. So you have a risk of getting a “mathematically wrong” result.
Upvotes: 7
Reputation: 225227
First, let's look at the initialization of x. Assuming a 32-bit int, the constant 0x80000000 has type unsigned int with value 2³¹. So this constant must be converted to type int. Section 6.3.1.3p3 of the C standard dictates how this happens:
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
So an implementation-defined conversion happens. On a two's complement system, this is typically implemented by assigning the low 32 bits of the value in question's representation directly to the object to be assigned to. This results in x having the value -2³¹.
Now x is assigned to y. This means the value is converted from int to unsigned int, and the value in question is negative. So the conversion is dictated by section 6.3.1.3p2:
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type
The maximum value for unsigned int (assuming 32 bit) is 2³²-1, so one more than this is 2³². Adding 2³² to -2³¹ gives us 2³¹, which is what is stored in y.
Now to the comparison.
When two different arithmetic types are compared via the == operator, they undergo the usual arithmetic conversions.
Section 6.3.1.8p1 of the C standard states the following regarding how two integer types are converted:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
The third rule (the operand with unsigned integer type has rank greater than or equal to the rank of the other operand) is the one that applies here, since we're comparing an int with an unsigned int. So the value of x is converted to the type unsigned int.
Going back to the conversion rules of section 6.3.1.3p2, the maximum value for unsigned int (assuming 32 bit) is 2³²-1, so one more than this is 2³². Adding 2³² to the value of x, i.e. -2³¹, gives us 2³¹. This is the same as the value of y, so the comparison is true.
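In the second example, where x has type int64_t and y has type int, the implementation-defined conversion from int64_t to int when x is assigned to y likely results in y being 0, since the lower 32 bits of x are all 0. For the comparison, both operands have signed integer types, so y is converted to the type of greater rank, int64_t. The value 0 is then compared with the nonzero value of x, and the result is false.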
Upvotes: 3