user1149549


Simple Character Interpretation In C

Here is my code

 #include <stdio.h>

 int main(void)
 {
     char ch = 129;
     printf("%d", ch);
     return 0;
 }

I get the output as -127. What does it mean?

Upvotes: 3

Views: 811

Answers (9)

whtlnv

Reputation: 2227

It means that char is an 8-bit variable that can only hold 2^8 = 256 values. Since the declaration is char ch, ch is a signed variable (on your compiler), which means it can store values from -128 to 127. When you ask to go over 127, the value starts over from -128.

Think of it like some arcade games where you go from one side of the screen to the other:

ch = 50;

                                    ----->                        50 is stored
      |___________________________________|___________|           since it fits
    -128                       0         50          127          between -128
                                                                  and 127

ch = 129;

                                                    ---           129 goes over
      -->                                                         127 by 2, so
      |__|____________________________________________|           it 'lands' in
    -128  -127                 0                     127          -127

BUT!! you shouldn't rely on this, since the C standard leaves the result of storing an out-of-range value in a signed char up to the implementation!


In honor of Luchian Grigore here's the bit representation of what's happening:

A char is a variable that holds 8 bits, or one byte. So we have eight 0's and 1's struggling to represent whatever value you desire. If the char is signed, those bits must also encode whether the number is positive or negative. You probably read about one bit representing the sign; that's an abstraction of the true process, and in fact it was only one of the first solutions implemented in electronics. But such a trivial method had a problem: you would have 2 ways of representing 0 (+0 and -0):

0 0000000     ->    +0        1 0000000     ->    -0                    
^                             ^ 
|_ sign bit 0: positive       |_ sign bit 1: negative

Inconsistencies guaranteed!! So, some very smart folks came up with a system called Ones' Complement which would represent a negative number as the negation (NOT operation) of its positive counterpart:

01010101      ->    +85
10101010      ->    -85

This system... had the same problem. 0 could be represented as 00000000 (+0) and 11111111 (-0). Then came some smarter folks who created Two's Complement, which keeps the negation step of the earlier method and then adds 1, thereby removing that pesky -0 and adding a shiny new number to our range: -128! So how does our range look now?

00000000     +0
00000001     +1
00000010     +2
...
01111110     +126
01111111     +127
10000000     -128
10000001     -127
10000010     -126
...
11111110     -2
11111111     -1

So, this should give an idea of what's happening when our little processor tries to add numbers to our variable:

 00110010    50                   01111111     127
+00000010   + 2                  +00000010    +  2
 --------    --                   --------     ---
 00110100    52                   10000001    -127
      ^                                 ^       ^
      |_ 1 + 1 = 10         129 in bin _|       |_ wait, what?!

Yep, if you review the range table above you can see that up to 127 (01111111) the binary was fine and dandy, nothing weird happening, but once the 8th bit is set at -128 (10000000), the number is no longer interpreted by its binary magnitude but by its Two's Complement representation. This means the binary representation, the bits in your variable, the 1's and 0's, the heart of our beloved char, does hold a 129... it's there, look at it! But the evil processor reads that as a measly -127 because the variable HAD to be signed, undermining all its positive potential for a smelly shift through the real number line in the Euclidean space of dimension one.

Upvotes: 4

Johan Lundberg

Reputation: 27088

On your system, char 129 has the same bits as the 8-bit signed integer -127. An 8-bit unsigned integer ranges from 0 to 255, and an 8-bit signed integer from -128 to 127.

Related (C++):

You may also be interested in reading the nice top answer to What is an unsigned char?

As @jmquigley points out, this is strictly undefined behavior and you should not rely on it. See: Allowing signed integer overflows in C/C++

Upvotes: 1

Luchian Grigore

Reputation: 258688

It means you ran into undefined behavior.

Any outcome is possible.

char ch=129; is UB because 129 is not a representable value for a char on your specific setup.

Upvotes: 2

Eregrith

Reputation: 4366

This comes from the fact that a char is coded on one byte, so 8 bits of data.

In fact, a char has its value coded on 7 bits and has one bit for the sign, whereas an unsigned char has all 8 bits of data for its value.

This means:

Taking abcdefgh as 8 bits respectively (a being the leftmost bit, and h the rightmost), the value is encoded with a for the sign and bcdefgh in binary format for the real value:

42(decimal) = 101010(binary) stored as : abcdefgh 00101010

When using this value from the memory : a is 0 : the number is positive, bcdefgh = 0101010 : the value is 42

What happens when you put 129 :

129(decimal) = 10000001(binary) stored as : abcdefgh 10000001

When using this value from the memory : a is 1 : the number is negative, so we should subtract one and invert all bits in the value: (bcdefgh - 1) inverted = 1111111 : the value is 127. The number is -127.

Upvotes: 1

Lundin

Reputation: 215350

Whether a plain char is signed or unsigned is implementation-defined behavior. This is a quite stupid, obscure rule in the C language. int, long etc. are guaranteed to be signed, but char could be signed or unsigned; it is up to the compiler implementation.

On your particular compiler, char is apparently signed. This means, assuming that your system uses two's complement, that it can hold values of -128 to 127.

You attempt to store the value 129 in such a variable. This leads to undefined behavior, because you get an integer overflow. Strictly speaking, anything can happen when you do this. The program could print "hello world" or start shooting innocent bystanders, and still conform to ISO C. In practice, most (all?) compilers will however implement this undefined behavior as "wrap around", as described in other answers.

To sum it up, your code relies on two different behaviors that aren't well defined by the standard. Understanding how the result of such unpredictable code ends up in a certain way has limited value. The important thing here is to recognize that the code is obscure, and learn how to write it in a way that isn't obscure.

The code could for example be rewritten as:

unsigned char ch = 129;

Or even better:

#include <stdint.h>
...
uint8_t ch = 129;

As a rule of thumb, make sure to follow these rules in MISRA-C:2004:

6.1 The plain char type shall be used only for the storage and use of character values.

6.2 signed and unsigned char type shall be used only for the storage and use of numeric values.

Upvotes: 0

David Grayson

Reputation: 87541

Your char is most likely an 8-bit signed integer that is stored using Two's complement. Such a variable can only represent numbers between -128 and 127. If you do "127+1" it wraps around to -128. So 129 is equivalent to -127.

Upvotes: 1

Brian Roach

Reputation: 76918

char is 8 bits, signed. It can only hold values -128 to 127. When you try and assign 129 to it you get the result you see because the bit that indicates signing is flipped. Another way to think of it is that the number "wraps" around.

Upvotes: 0

Some programmer dude

Reputation: 409482

The type char can be either signed or unsigned; it's up to the compiler. Most compilers have it as signed.

In your case, the compiler silently converts the integer 129 to its signed variant, and puts it in an 8-bit field, which yields -127.

Upvotes: 0

Lukáš Lalinský

Reputation: 41326

The char type is an 8-bit signed integer. If you interpret the representation of unsigned byte 129 in the two's complement signed representation, you get -127.

Upvotes: 0
