computronium

Reputation: 447

C - where is the sign bit in the negative char (-128) when it's always encoded as the 2's complement of its binary?

I am having trouble reconciling the two facts mentioned above.

If we look at the example of -128 here, the following steps are taken while encoding it

  1. figure out 128's binary equivalent: 10000000
  2. take 1's complement: 01111111
  3. add 1 to 1's complement to get 2's complement: 10000000

My question is: Where is the sign bit for a negative integer? In other words, I want to understand how 10000000 gets decoded to -128 and not -0

  1. If the left most 1 (MSB) is used to code the negative sign of -128, doesn't that leave us with 7-bit binary 0000000 whose decimal equivalent is 0 (and not -128)?
  2. Or does the computer (figuratively, of course)--when performing computations on negative integers stored at some memory location--go through steps 1-3 above in reverse order to decode the value of the 1-byte char whenever it sees 1 as the most significant bit (as opposed to 0)?
  3. Or is the MSB encoding both the sign and the bit at the 8th place (2^7) in -128?

I already saw this question, but mine's different because I understand perfectly well that +128 can't be stored in 1 byte: its signed binary representation would be 010000000, which requires 9 bits.

Upvotes: 2

Views: 650

Answers (5)

Aganju

Reputation: 6395

The exact layout depends on the machine (CPU), but not on C. The compiler is built for a specific machine, and knows how to instruct it to handle it right.

Upvotes: -3

Eric Postpischil

Reputation: 222724

If we look at the example of -128 here, the following steps are taken while storing it…

Not when storing it, when calculating/converting it. When a compiler is processing -128 in source code or when we see it on paper and work with it, we do whatever computations we want. We can use bits or digits or marks on paper for whatever we want. When we produce the final answer, then the bits in that final answer have their final meanings. The intermediate steps do not have to use bits in the same way.

Given “128”, we calculate this is 10000000 in pure binary (no sign). Then we can calculate its two’s complement representation by complementing the bits to 01111111 and adding 1 (still in pure binary, no sign) to get 10000000. Then these same bits are the two’s complement representation.
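As a minimal sketch of those two steps, working purely on unsigned bit patterns (the helper names `ones_complement` and `twos_complement` are made up for illustration, and `uint8_t` is assumed to be available):

```c
#include <stdint.h>

/* Hypothetical helpers, not from any library. */

/* Flip every bit of an 8-bit pattern. */
uint8_t ones_complement(uint8_t bits)
{
    return (uint8_t)~bits;
}

/* Flip every bit, then add 1, keeping only the low 8 bits. */
uint8_t twos_complement(uint8_t bits)
{
    return (uint8_t)(ones_complement(bits) + 1u);
}
```

Under these assumptions, `twos_complement(0x80)` gives `0x80` again (10000000), which is why 128 in pure binary and −128 in two’s complement share the same eight bits.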

When the byte is interpreted, including when it is used in arithmetic or converted from two’s complement representation to decimal, the high bit will be interpreted as a sign bit. But, again, we do not need to use bits in the same way throughout the computation. We can take 10000000, see the high bit is set to tell us the number is negative, and then take its two’s complement as before: Complement the bits to 01111111 and then add one to make 10000000. Now we have the same bits, but they are a pure binary number, with no sign. They represent 128, and we know it is negative because we observed the original sign bit earlier.
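The decoding described above could be sketched roughly like this. `decode_by_negation` is a hypothetical name, and this is only one way to write it, not what any particular compiler or CPU does internally:

```c
#include <stdint.h>

/* Hypothetical helper: decode an 8-bit two's complement pattern by
   noting the sign bit first, then taking the two's complement. */
int decode_by_negation(uint8_t bits)
{
    if ((bits & 0x80u) == 0)
        return bits;            /* high bit clear: plain binary value */

    /* High bit set: complement the bits and add one, in pure binary. */
    unsigned magnitude = (bits ^ 0xFFu) + 1u;

    /* The magnitude is at most 128, so negating it as an int is safe. */
    return -(int)magnitude;
}
```

For the pattern 10000000 (`0x80`), the magnitude comes out as 128 and the remembered sign makes the result −128.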

Also note that signed char x and unsigned char y use the same bit patterns to represent different values. When x has the bit pattern 11111111, it represents −1. When y has the bit pattern 11111111, it represents 255. To make this work, the compiler will use different instructions for operations with x than for operations with y. There are different instructions for working with signed types than for working with unsigned types. (Many of them overlap in large part; addition and subtraction are often performed with the same instructions, but the flag results are interpreted differently to detect overflow and other conditions.)
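To illustrate the same bits giving different values, here is a small sketch that copies one byte into each type. The function names are invented for this example, and it assumes 8-bit bytes and a two’s complement `signed char`:

```c
#include <string.h>

/* Hypothetical helpers: read the same byte through two different types. */

int read_as_signed(unsigned char byte)
{
    signed char x;
    memcpy(&x, &byte, 1);   /* copy the bit pattern unchanged */
    return x;
}

int read_as_unsigned(unsigned char byte)
{
    unsigned char y;
    memcpy(&y, &byte, 1);   /* same bits, different type */
    return y;
}
```

With those assumptions, `read_as_signed(0xFF)` yields −1 while `read_as_unsigned(0xFF)` yields 255.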

Additionally, for this single-byte example, the compiler generally does not work with it as a char. In source text, 128 is an int constant. Internally, a compiler likely converts 128 to a 32-bit int and then negates it to make −128, with bits 11111111111111111111111110000000, and then, to store it in a signed char, it takes the low eight bits, 10000000. (This may vary depending on the compiler.)
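A rough sketch of that truncation, using fixed-width types so the 32-bit assumption is explicit (`pattern32` and `low_eight_bits` are hypothetical names, and a real compiler may do this differently):

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* The 32 bits of a value, viewed as an unsigned number.
   Conversion to uint32_t is defined as reduction modulo 2^32. */
uint32_t pattern32(int32_t value)
{
    return (uint32_t)value;
}

/* Keep only the low eight bits of that pattern. */
uint8_t low_eight_bits(int32_t value)
{
    return (uint8_t)(pattern32(value) & 0xFFu);
}
```

Here `pattern32(-128)` is `0xFFFFFF80` and `low_eight_bits(-128)` is `0x80`, matching the bit strings above.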

Interestingly, this boundary issue does affect the type of -2147483648. Consider a C implementation that uses a 32-bit int and a 64-bit long. −2,147,483,648 is representable in a 32-bit int, but, in the C grammar, -2147483648 is not a constant but is a combination of - and 2147483648. And, since 2,147,483,648 is not representable in a 32-bit int, it is a long constant. So the type of -2147483648 is long. You can verify this with:

printf("%zu %zu\n", sizeof -2147483647, sizeof -2147483648);

It will print "4 8" in C implementations with a 32-bit int.

(Which raises the issue of how INT_MIN is defined. It must have the value −2,147,483,648, but it must have type int.)
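One common way around this is to build the value from int constants only. The sketch below assumes a 32-bit int and a C11 compiler (for `_Generic`). `MY_INT_MIN` and `IS_INT` are made-up names; actual `<limits.h>` headers vary, although many use a definition of this general shape:

```c
#include <limits.h>

/* Hypothetical macro: both operands are int constants, so the result is int. */
#define MY_INT_MIN (-2147483647 - 1)

/* Hypothetical macro: 1 if the expression has type int, 0 otherwise. */
#define IS_INT(expr) _Generic((expr), int: 1, default: 0)

int my_int_min_is_int(void)
{
    return IS_INT(MY_INT_MIN);
}

/* With a 32-bit int, 2147483648 does not fit in int, so this is not int. */
int literal_is_int(void)
{
    return IS_INT(-2147483648);
}

int my_int_min_matches(void)
{
    return MY_INT_MIN == INT_MIN;
}
```

Under the 32-bit int assumption, the first and third functions return 1 and the second returns 0.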

Upvotes: 2

John Bode

Reputation: 123468

While two’s complement is by far the most common way to represent signed integer values, it is not the only way, and C does not require two’s complement representation.

Note that you cannot represent a signed 128 in only 8 bits in any of these representations: you can represent either the range [-127..127] in ones’ complement or sign-magnitude, or the range [-128..127] in two’s complement. So by definition you need more than 8 bits to represent a signed 128:

        two’s      ones’       sign-magnitude

 125    01111101   01111101    01111101
 126    01111110   01111110    01111110
 127    01111111   01111111    01111111
-128    10000000   n/a         n/a
-127    10000001   10000000    11111111
-126    10000010   10000001    11111110
-125    10000011   10000010    11111101
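The rows of the table can be reproduced with a short sketch like the following. The function names are hypothetical, and each function only makes sense for inputs inside the range its representation supports:

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Two's complement: n in [-128, 127]. Conversion to uint8_t is modulo 256. */
uint8_t encode_twos(int n)
{
    return (uint8_t)n;
}

/* Ones' complement: n in [-127, 127]. Negative values flip every bit of |n|. */
uint8_t encode_ones(int n)
{
    return n >= 0 ? (uint8_t)n : (uint8_t)~(unsigned)(-n);
}

/* Sign-magnitude: n in [-127, 127]. Negative values set the high bit over |n|. */
uint8_t encode_sign_magnitude(int n)
{
    return n >= 0 ? (uint8_t)n : (uint8_t)(0x80u | (unsigned)(-n));
}
```

For example, −127 encodes as `0x81`, `0x80`, and `0xFF` respectively, which matches the −127 row.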

Upvotes: 1

Sahyog Vishwakarma

Reputation: 410

For any signed integer type such as "int", "short", "byte", "char", or "long" (that is, everything except the "unsigned" types), the values range from -1 * 2^(total bits - 1) to 2^(total bits - 1) - 1, and the leftmost bit is used as the sign bit.

The char data type has a size of 1 byte, so it has 8 bits. The leftmost bit is used for the sign and the remaining bits for the value, so with the sign bit clear the values range from

0 to 127, which is (00000000)2 to (01111111)2,

and with the sign bit set they range from

-128 to -1, which is (10000000)2 to (11111111)2.
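A quick sketch of that range formula, assuming two's complement (`signed_min` and `signed_max` are invented names for this example):

```c
/* Hypothetical helpers: range of a two's complement integer with `bits` bits.
   Valid for 1 <= bits <= 63 so the shifts stay inside long long. */

long long signed_min(int bits)
{
    return -(1LL << (bits - 1));
}

long long signed_max(int bits)
{
    return (1LL << (bits - 1)) - 1;
}
```

For 8 bits this gives −128 and 127.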

Upvotes: -1

chux

Reputation: 153488

Where is the sign bit for a negative integer?

Door #3: "Or is the MSB encoding both the sign and the bit at the 8th place (2^7) in -128?"

With an 8-bit signed char, encoded as 2's complement, M is 7 and ...

the sign bit has a value of -(2^M) (C17dr § 6.2.6.2 2)

10000000 is -128 + 0 * 64 + 0 * 32 + 0 * 16 + 0 * 8 + 0 * 4 + 0 * 2 + 0 * 1 --> -128
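The same sum can be written as a small sketch (`decode_weighted` is a hypothetical name, and 8-bit two's complement is assumed):

```c
#include <stdint.h>

/* Hypothetical helper: decode 8 bits of two's complement as a weighted sum,
   with value bit i weighted +(2^i) and the sign bit weighted -(2^7). */
int decode_weighted(uint8_t bits)
{
    const int M = 7;            /* number of value bits */
    int value = 0;

    /* Value bits contribute their usual positive weights. */
    for (int i = 0; i < M; i++)
        if ((bits >> i) & 1u)
            value += 1 << i;

    /* The sign bit contributes a negative weight. */
    if ((bits >> M) & 1u)
        value -= 1 << M;

    return value;
}
```

No separate "negate" step is needed: 10000000 decodes to −128 and 11111111 decodes to −128 + 127 = −1.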

Upvotes: 5
