betteroutthanin

Reputation: 7546

Why can't char be used to store the result of getchar()?

I'm new to C. The book I'm reading has this sample code:

#include <stdio.h>

int main(void) {
  int c;
  c = getchar();

  while (c != EOF) {
    putchar(c);
    c = getchar();
  }
}

The author writes a sentence like this:

We can't use char since c must be big enough to hold EOF in addition to any possible char. Therefore we use int.

Trying to understand, I modified the code like this:

#include <stdio.h>

int main(void) {
  char c=getchar();

  while (c != EOF) {
    putchar(c);
    c = getchar();
  }

  if (c == EOF) {
    putchar('*');
  }
}

When I press Ctrl+D, * is printed, which means c holds EOF. That confuses me. Can anybody explain what is going on?

Upvotes: 1

Views: 243

Answers (5)

Colin D Bennett

Reputation: 12084

Because EOF is a special sentinel value (implementation defined, but usually -1 as an int type), it can't be distinguished from the value 255 if stored in a char variable. You need a type larger than 8 bits in order to represent all possible byte values returned by getchar(), plus the special sentinel value EOF. With 8-bit bytes, there are 257 different possible return values from getchar(): 256 byte values plus EOF.

Also, on a related note, character literals in C like 'a' have the type int. In C++, on the other hand, character literals have the type char. So you will see characters usually passed to and returned from C Standard Library functions as int types.

Upvotes: 4

WhozCraig

Reputation: 66194

The result of your getchar() is being stored as a char after a conversion by value. The value in this case is EOF, likely (-1).

6.3.1.3-p1 Signed and unsigned integers

When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

The comparison in your while condition is covered as well, because both operands are converted to a common type before they are compared:

6.5.9-p4 Equality Operators

If both of the operands have arithmetic type, the usual arithmetic conversions are performed. Values of complex types are equal if and only if both their real parts are equal and also their imaginary parts are equal. Any two values of arithmetic types from different type domains are equal if and only if the results of their conversions to the (complex) result type determined by the usual arithmetic conversions are equal.

Both char and int are integer types. Both can hold the integer-value (-1) on your platform. Therefore your code "works".

Upvotes: 1

Nerf Herder

Reputation: 414

The type char is at least 8 bits wide (exactly 8 on virtually every mainstream platform), and whether plain char is signed or unsigned is implementation-defined.

EOF is implementation-defined, but often -1. That is a signed number (0xFFFFFFFF as a 32-bit int). Assigning it to an 8-bit char truncates it to 0xFF, but that is also a valid (if rarely used) character value, so you can't really be sure whether you have the byte value 255 or EOF (-1).

In addition, some code may be written to look for a return value of <0 to stop reading. On a platform where char is unsigned, a char will never be less than zero, so such a test would never succeed.

Upvotes: 1

Shahbaz

Reputation: 47493

When you press Ctrl-D, getchar() returns EOF, an int that is not a valid character value. Assigning it to a char keeps only as much of it as fits. Let's assume a common value for EOF: 0xFFFFFFFF as a 32-bit int, in other words, -1. When this value is assigned to a char (assuming it's signed), it is truncated to 0xFF, which is also -1.

So your if becomes:

if ((char)-1 == (int)-1)

The (char)-1 gets promoted to int so that it can be compared with (int)-1. Since signed values are sign-extended on promotion (to keep the original signed value), you end up comparing -1 and -1, which is true.

That said, this only works by luck. If you actually read a character with value 0xFF, you would mistake it for EOF. Not to mention that EOF may not be -1 in the first place. All of this aside, you shouldn't let your program truncate a value when assigning to a variable (unless you know what you are doing).

Upvotes: 3

glglgl

Reputation: 91017

char is, at least on your system, signed. Therefore, it can hold values from -128 to 127.

EOF is -1 and is, therefore, one of them.

Your code works, as the -1 is retained. But as soon as you input the character whose value is 255, you erroneously get -1 as well.

Upvotes: 1
