Hesam Qodsi

Reputation: 1585

Why is (int)'\xff' != 0xff but (int)'\x7f' == 0x7f?

Consider this code:

#include <iostream>

typedef union
{
    int integer_;
    char mem_[4];
} MemoryView;

int main()
{
    MemoryView mv;
    mv.integer_ = (int)'\xff';
    for (int i = 0; i < 4; i++)
        std::cout << mv.mem_[i]; // output is \xff\xff\xff\xff

    mv.integer_ = 0xff;
    for (int i = 0; i < 4; i++)
        std::cout << mv.mem_[i]; // output is \xff\x00\x00\x00

    // now I try with a value less than 0x80
    mv.integer_ = (int)'\x7f';
    for (int i = 0; i < 4; i++)
        std::cout << mv.mem_[i]; // output is \x7f\x00\x00\x00

    mv.integer_ = 0x7f;
    for (int i = 0; i < 4; i++)
        std::cout << mv.mem_[i]; // output is \x7f\x00\x00\x00

    // now I try with 0x80
    mv.integer_ = (int)'\x80';
    for (int i = 0; i < 4; i++)
        std::cout << mv.mem_[i]; // output is \x80\xff\xff\xff

    mv.integer_ = 0x80;
    for (int i = 0; i < 4; i++)
        std::cout << mv.mem_[i]; // output is \x80\x00\x00\x00
}

I tested it with both GCC 4.6 and MSVC 2010 and the results were the same. With values less than 0x80 the output is what I expect, but with 0x80 and above, the upper three bytes are '\xff'.

CPU: Intel Core 2 Duo; Endianness: little; OS: Ubuntu 12.04 LTS (64-bit), Windows 7 (64-bit)

Upvotes: 1

Views: 2529

Answers (4)

Filip Roséen

Reputation: 63862

It is implementation-defined whether type char is signed or unsigned.


Assigning a variable of type char the value 0xFF yields either 255 (if char is really unsigned) or -1 (if char is really signed) in most implementations, where the number of bits in char is 8.

Values less than or equal to 0x7F (127) fit in both an unsigned char and a signed char, which explains why you get the expected result for those values.


#include <iostream>
#include <limits>

int
main (int argc, char *argv[])
{
  std::cerr << "unsigned char: "
            << +std::numeric_limits<unsigned char>::min ()
            << " to "
            << +std::numeric_limits<unsigned char>::max ()
            << ", 0xFF = "
            << +static_cast<unsigned char> ('\xFF')
            << std::endl;

  std::cerr << "  signed char: "
            << +std::numeric_limits<signed char>::min ()
            << " to "
            << +std::numeric_limits<signed char>::max ()
            << ", 0xFF = "
            << +static_cast<signed char> ('\xFF')
            << std::endl;
}

typical output

unsigned char: 0 to 255, 0xFF = 255
  signed char: -128 to 127, 0xFF = -1

To circumvent the problem, explicitly make the value signed or unsigned; in this case, casting your value to an unsigned char is sufficient:

mv.integer_ = static_cast<unsigned char> ('\xFF'); /* 255, NOT -1 */

Side note: you are invoking undefined behaviour when reading a member of a union other than the one you last wrote to; the standard doesn't specify what happens in this case. Under most implementations it will work as expected (accessing mv.mem_[0] will most probably yield the first byte of mv.integer_), but this is not guaranteed.

Upvotes: 4

Joachim Isaksson

Reputation: 181027

Basically, 0xff stored in a signed 8-bit char is -1. Whether a plain char (without a signed or unsigned specifier) is signed or unsigned depends on the compiler and/or platform, and in this case it is evidently signed.

Cast to an int, it keeps the value -1, which stored in a 32 bit signed int is 0xffffffff.

0x7f, on the other hand, stored in an 8-bit signed char is 127, which cast to a 32-bit int is 0x0000007f.

Upvotes: 1

Mats Petersson

Reputation: 129474

Because '\xff' is a signed char (char defaults to signed on many architectures, but not always), when it is converted to an integer it is sign-extended to fill the 32-bit (in this case) int.

In binary arithmetic, nearly all negative representations use the highest bit to indicate "this is negative" and some sort of "inverse" logic to represent the value. The most common is "two's complement", where there is no "negative zero". In this form, all ones is -1, and the "most negative number" is a 1 followed by all zeros, so 0x80 in 8 bits is -128, 0x8000 in 16 bits is -32768, and 0x80000000 in 32 bits is -2147483648.

A solution, in this case, would be to use static_cast<unsigned char>('\xff').

Upvotes: 2

Mat

Reputation: 206851

The type of '\xff' is char. char is a signed integral type on a lot of platforms, so the value of '\xff' is negative (-1 rather than 255). When you convert (cast) that to an int (also signed), you get an int with the same negative value.

Anything strictly less than 0x80 is positive, so the conversion yields a positive value.

Upvotes: 3
