Reputation: 910

Why does printing char sometimes print 4 bytes number in C

Why does printing a hex representation of char to the screen using printf sometimes prints a 4 byte number?

This is the code I have written

#include <stdio.h>
#include <stdint.h>

int main(void) {
    char testStream[8] = {'a', 'b', 'c', 'd', 0x3f, 0x9d, 0xf3, 0xb6};
   
    int i;
    for(i=0;i<8;i++){
      printf("%c = 0x%X, ", testStream[i], testStream[i]);
    }
    
    return 0;
}

And following is the output:

a = 0x61, b = 0x62, c = 0x63, d = 0x64, ? = 0x3F, � = 0xFFFFFF9D, � = 0xFFFFFFF3, � = 0xFFFFFFB6

Upvotes: 7

Answers (5)

chqrlie

Reputation: 144969

char is signed on your platform: the initializer 0x9d for the 6th character is larger than CHAR_MAX (157 > 127), so it is converted to char as the negative value -99 (157 - 256 = -99) and stored at offset 5 in testStream.

When you pass testStream[5] as an argument to printf, it is first promoted to int, with a value of -99. printf actually expects an unsigned int for the "%X" format specifier.

On your architecture, int is 32 bits with 2's complement representation of negative values, hence the value -99 passed as int is interpreted as 4294967197 (2^32-99), whose hexadecimal representation is 0xFFFFFF9D. On a different architecture, it could be something else: on 16-bit DOS, you would get 0xFF9D, on a 64-bit Cray you might get 0xFFFFFFFFFFFFFF9D.

To avoid this confusion, you should cast the arguments passed to printf to (unsigned char). Try replacing your printf with this:

printf("%c = 0x%02X, ", (unsigned char)testStream[i], (unsigned char)testStream[i]);

Upvotes: 1

Icemanind

Reputation: 48716

On your machine, char is signed by default. Change the type to unsigned char and you'll get the results you are expecting.

A quick explanation of why this is

In computer systems, the MSB (Most Significant Bit) is the bit with the highest value (the leftmost bit). In a signed type, the MSB determines whether the number is positive or negative. Even though a char is 8 bits long, a signed char can use only 7 bits for the magnitude, because the 8th bit determines whether it's positive or negative. Here is an example:

Data Type: signed char
  Decimal: 25
   Binary: 00011001
           ^
           |
           --- Sign flag. 0 indicates a positive number. 1 indicates a negative number

Because a signed char uses the 8th bit as a sign flag, the number of bits it can actually use to store a number is 7 bits. The largest value you can store in 7 bits is 127 (7F in hex).

In order to convert a number from positive to negative, computers use something called two's complement. How it works is that all the bits are inverted, then 1 is added to the value. Here's an example:

Decimal: 25
 Binary: 00011001

Decimal: -25
 Binary: 11100111

When you declared char testStream[8], the compiler treated the elements as signed chars. When you assigned a value of 0x9D or 0xF3, those numbers were bigger than 0x7F, which is the biggest number that can fit into the 7 bits of a signed char. Therefore, when you tried to printf the value to the screen, it was expanded into an int and filled with FF's.

I hope this explanation clears things up!

Upvotes: 1

M.M

Reputation: 141628

From the C standard (C11 7.21.6.1/8) description of %X:

The unsigned int argument is converted to unsigned octal (o), unsigned decimal (u), or unsigned hexadecimal notation (x or X) in the style dddd; the letters abcdef are used for x conversion and the letters ABCDEF for X conversion.

You did not provide an unsigned int as argument¹, therefore your code causes undefined behaviour.

In this case the undefined behaviour manifests itself as the implementation of printf writing its code for %X to behave as if you only ever pass unsigned int. What you are seeing is the unsigned int value which has the same bit-pattern as the negative integer value you gave as argument.

There's another issue too, with:

char testStream[8] = {'a', 'b', 'c', 'd', 0x3f, 0x9d, 0xf3, 0xb6};

On your system the range of char is -128 to +127. However 0x9d, which is 157, is out of range for char. This causes implementation-defined behaviour (and may raise a signal); the most common implementation definition here is that the char with the same bit-pattern as (unsigned char)0x9d will be selected.

¹ Although it says unsigned int, this section is usually interpreted to mean that a signed int, or any argument of lower rank, with a non-negative value is permitted too.

Upvotes: 2

Andreas Bombe

Reputation: 2470

char appears to be signed on your system. With the standard "two's complement" representation of integers, having the most significant bit set means it is a negative number.

In order to pass a char to a vararg function like printf it has to be expanded to an int. To preserve its value the sign bit is copied to all the new bits (0x9D → 0xFFFFFF9D). Now the %X conversion expects and prints an unsigned int and you get to see all the set bits in the negative number rather than a minus sign.

If you don't want this, you have to either use unsigned char or cast it to unsigned char when passing it to printf. An unsigned char has no extra bits compared to a signed char and therefore the same bit pattern. When the unsigned value gets extended, the new bits will be zeros and you get what you expected in the first place.

Upvotes: 11

trbvm

Reputation: 141

What seems to happen here is an implicit char -> int -> unsigned int conversion. When a positive char is converted to int, nothing bad happens. But negative chars such as 0x9d, 0xf3, 0xb6 stay negative when converted to int, and therefore they become 0xffffff9d, 0xfffffff3, 0xffffffb6. Note that the actual value is not changed: as a 32-bit int, 0xffffff9d is -99, and as a signed char, 0x9d is also -99. To print them properly you can change your code to

printf("%c = 0x%X, ", testStream[i] & 0xff, testStream[i] & 0xff);

Upvotes: 0
