Daniel Gilbert

Reputation: 133

Does it matter if I don't explicitly cast an int to a char before comparison with a char?

When comparing an int to a char, should I explicitly cast one of them, or let the compiler do it for me? Does it matter? Why?

E.g.

#include <iostream>
using std::cout;

int main() {
    int i = 113;
    char c = 'q';
    if (static_cast<char>(i) == c)
        cout << "equal";
    // OR
    if (i == c)
        cout << "equal";
}

Upvotes: 3

Views: 776

Answers (2)

yeputons

Reputation: 9238

Actually, it does matter in some cases, due to integral promotion and the fact that an explicit conversion of int to char discards the higher bytes. Basically, when you compare two integer types of different sizes, the usual arithmetic conversions bring both operands to a "common type", which is typically wide enough to fit the values of both. But if you do an explicit conversion yourself, you may lose some information.

Consider the following code:

#include <iostream>

using namespace std;

int main() {
    int i = 113;
    char c = 'q';
    cout << (static_cast<char>(i) == c) << endl; // 1
    cout << (i == c) << endl; // 1

    i += 0x100; // i is now 369, but lower byte is 113.
    cout << (static_cast<char>(i) == c) << endl; // 1
    cout << (i == c) << endl; // 0
    return 0;
}

When you explicitly convert an int to a char, the higher bytes are typically truncated (strictly speaking, the result is implementation-defined when the value doesn't fit in char). However, when you compare an int with a char, the latter is automatically promoted to int, and then the two ints are compared.
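
A minimal sketch of that behavior, step by step (assuming 8-bit char and the typical two's complement wrap-around):

int i = 369;                     // 0x171
char low = static_cast<char>(i); // keeps only the low byte: 0x71 == 113 == 'q' (implementation-defined if char is signed)
int back = low;                  // converting/promoting back to int preserves the value: back is 113, not 369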

In this particular case you can avoid the explicit cast. However, as the other answer here points out, that is not always correct by convention: in the standard library, characters stored in an int are expected to be non-negative values (e.g. 0..255), whereas char may be signed depending on the compiler/platform (e.g. -128..127). That causes potential problems as soon as non-ASCII characters appear in your comparison.

So, if you are absolutely sure that you will never need non-ASCII characters, you can avoid explicit casts. But if that's not the case (or if you want to build a good habit), it's better to follow the convention and convert the char to unsigned char before comparing it with the int:

i == static_cast<unsigned char>(c)
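
For example (a minimal sketch; 0xE5 is assumed here as the Latin-1 code for 'å', standing in for any non-ASCII character):

#include <iostream>

using namespace std;

int main() {
    char c = static_cast<char>(0xE5); // 'å' in Latin-1; typically -27 if char is signed
    int i = 0xE5;                     // the same character code as a non-negative int (229)

    cout << (i == c) << endl;                             // 0 where char is signed: c promotes to -27
    cout << (i == static_cast<unsigned char>(c)) << endl; // 1: c is converted to 229 first
    return 0;
}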

Upvotes: 7

Cheers and hth. - Alf

Reputation: 145279

The cast in your example should go the other way, from char to int, via unsigned char.


The established very very strong convention for representing char values as int, in particular for functions such as isalpha, is that a negative char value is represented as that value converted to unsigned char, a strictly non-negative value.

And for code adhering to this most common convention, the condition in

if (static_cast<char>(i) == c)

does entirely the wrong thing.

For example, with 8-bit bytes and a two's complement signed char type, which is by far the most common, the i value 128 represents the char value -128. The condition above can then incorrectly yield false, because you have implementation-defined behavior when the value doesn't fit in the signed destination type. The cast has then completely needlessly introduced a bug.

A correct condition instead promotes that char to int, via a detour that converts negative values to strictly non-negative ones:

using Byte = unsigned char;
//...

if( i == static_cast<Byte>( c ) )

Even in the case where sizeof(int) == 1, this condition still works, including when char is an unsigned type.
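
For illustration, a minimal sketch contrasting the two directions (assuming 8-bit bytes and a signed, two's complement char, so that the concrete values 128 and -128 above apply):

#include <iostream>

using Byte = unsigned char;

int main()
{
    char const c = -128;    // the char value that the int value 128 represents
    int const i = 128;

    // Casting the int down to char: implementation-defined before C++20, not guaranteed to compare equal.
    std::cout << (static_cast<char>( i ) == c) << "\n";

    // Converting the char up via unsigned char: -128 converts to 128, so this reliably compares equal.
    std::cout << (i == static_cast<Byte>( c )) << "\n";
}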


The standard library's char classification functions have undefined behavior for negative arguments, except the special value EOF.

And so the call

isalpha( c )

will generally have undefined behavior, because usually char is a signed type, and negative values can occur, e.g. with Latin 1 encoding of 'Å'.

It should instead be

using Byte = unsigned char;
//...
isalpha( static_cast<Byte>( c ) )
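
Put together, a minimal sketch of that pattern (the is_alpha wrapper name is just illustrative, not a standard function):

#include <cctype>

using Byte = unsigned char;

// Safe for any char value, including negative ones on platforms where char is signed.
bool is_alpha( char c )
{
    return std::isalpha( static_cast<Byte>( c ) ) != 0;
}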

Upvotes: 2
