Reputation: 133
When comparing an int to a char, should I explicitly cast one of them, or let the compiler do it for me? Does it matter? Why?
E.g.
int i = 113;
char c = 'q';
if (static_cast<char>(i) == c)
    cout << "equal";

// OR

if (i == c)
    cout << "equal";
Upvotes: 3
Views: 776
Reputation: 9238
Actually, it does matter in some cases, due to integral promotion and the fact that an explicit conversion from int to char truncates the higher bytes.
Basically, if you try to compare two number-like types of different sizes, there are certain rules by which they're converted to some "common type", which is typically large enough to "fit" both types. But if you do an explicit conversion, you may lose some information.
Consider the following code:
#include <iostream>
using namespace std;
int main() {
    int i = 113;
    char c = 'q';

    cout << (static_cast<char>(i) == c) << endl; // 1
    cout << (i == c) << endl; // 1

    i += 0x100; // i is now 369, but lower byte is 113.

    cout << (static_cast<char>(i) == c) << endl; // 1
    cout << (i == c) << endl; // 0

    return 0;
}
When you explicitly convert int to a char, higher bytes typically get truncated (as noted in the comments, it's implementation-defined). However, when you compare int with char, the latter gets automatically promoted to int and then two ints are compared.
In this particular case you may avoid the explicit cast. However, as mentioned in this answer, it's not always ideologically correct to do so: in the standard library, characters stored in an int are typically non-negative numbers (e.g. 0..255), whereas char can be signed depending on the compiler/platform (e.g. -128..127). That will cause potential problems if there are non-ASCII characters in your comparison.
So, if you are absolutely sure that you will never ever need non-ASCII characters, you can avoid the explicit cast. But if that's not the case (or if you want to build a good habit), it's better to follow the convention and convert char to unsigned char before comparing with int:
i == static_cast<unsigned char>(c)
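For illustration, here is a minimal sketch of how the two comparisons diverge for a non-ASCII character. It assumes an 8-bit signed char and a Latin-1-style byte value (0xE4, i.e. 'ä'); those are assumptions about the platform and encoding, not guarantees:
#include <iostream>
using namespace std;

int main() {
    int i = 0xE4;                      // 228, a non-ASCII character code stored as a non-negative int
    char c = static_cast<char>(0xE4);  // the same byte in a char; typically -28 with a signed 8-bit char

    cout << (i == c) << endl;                             // typically 0: c promotes to -28
    cout << (i == static_cast<unsigned char>(c)) << endl; // 1: the conversion recovers the value 228

    return 0;
}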
Upvotes: 7
Reputation: 145279
The cast in your example should go the other way, from char to int, via unsigned char.
The established, very strong convention for representing char values as int, in particular for functions such as isalpha, is that a negative char value is represented as that value converted to unsigned char, which is a strictly non-negative value.
And for code adhering to this most common convention, the condition in
if (static_cast<char>(i) == c)
does entirely the wrong thing.
For example, with 8-bit bytes and a two's complement signed char type, which is by far the most common setup, the i value 128 represents the char value -128. The condition above can then incorrectly yield false, because converting a value that doesn't fit into the signed destination type is implementation-defined. The cast has then completely needlessly introduced a bug.
A correct condition instead promotes that char to int, via a detour that converts negative values to strictly non-negative ones:
using Byte = unsigned char;
//...
if( i == static_cast<Byte>( c ) )
In the case where sizeof(int) == 1, this condition still works, even when char is an unsigned type.
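Here is a minimal, self-contained sketch contrasting the two conditions, using the value 128 from above; the 8-bit two's complement signed char is an assumption carried over from the discussion:
#include <iostream>
using Byte = unsigned char;

int main() {
    int i = 128;                          // under the convention, this represents the char value -128
    char c = static_cast<char>( -128 );   // assumes an 8-bit two's complement signed char

    // Casting i down to a signed char when 128 doesn't fit relies on
    // implementation-defined behavior (before C++20), so this result isn't portable:
    std::cout << ( static_cast<char>( i ) == c ) << '\n';

    // Converting the char up through unsigned char compares 128 == 128,
    // which is well-defined and true:
    std::cout << ( i == static_cast<Byte>( c ) ) << '\n';
}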
The standard library's char classification functions have undefined behavior for negative arguments, except for the special value EOF.
And so the call
isalpha( c )
will generally have undefined behavior, because usually char is a signed type, and negative values can occur, e.g. with the Latin-1 encoding of 'Å'.
It should instead be
using Byte = unsigned char;
//...
isalpha( static_cast<Byte>( c ) )
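For example, a call that stays well-defined even for such a character could look like this (the byte value 0xC5 for 'Å' assumes a Latin-1-style encoding, and whether the result is nonzero depends on the current locale):
#include <cctype>
#include <iostream>
using Byte = unsigned char;

int main() {
    char c = static_cast<char>( 0xC5 );   // 'Å' in Latin-1; negative with a signed 8-bit char

    // isalpha( c ) would pass a negative value and have undefined behavior;
    // converting through unsigned char keeps the argument in the valid 0..UCHAR_MAX range.
    std::cout << std::isalpha( static_cast<Byte>( c ) ) << '\n';
}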
Upvotes: 2