Reputation: 389
#include <stdio.h>

int main()
{
    int c = getchar();
    while (c != EOF) {
        putchar(c);
        c = getchar();
    }
    return 0;
}
The problem is distinguishing the end of input from valid data. The solution is that getchar returns a distinctive value when there is no more input, a value that cannot be confused with any real character. This value is called EOF, for "end of file". We must declare c to be a type big enough to hold any value that getchar returns. We can't use char since c must be big enough to hold EOF in addition to any possible char. Therefore we use int.

The code and passage above are from 'The C Programming Language'.
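For reference, here is a small check that prints the ranges involved (a minimal sketch; the exact numbers depend on the implementation, though an 8-bit char and an EOF of −1 are typical on GNU/Linux):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* getchar() can return any value in 0..UCHAR_MAX, plus EOF. */
    printf("char range:          %d..%d\n", CHAR_MIN, CHAR_MAX);
    printf("unsigned char range: 0..%d\n", UCHAR_MAX);
    printf("EOF:                 %d\n", EOF);
    return 0;
}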
I have three questions.

Firstly, why do I get the output ^\Quit (core dumped) when I press the keys Ctrl and 4 simultaneously while the above program runs? I'm using a GNU/Linux machine.

Secondly, I wrote a program like this:
#include <stdio.h>

int main()
{
    printf("The part before EOF\n");
    putchar(EOF);
    printf("The part after EOF\n");
}
Then I compiled this as 'eof.out', changed int c = getchar(); in the program from the book into char c = getchar();, saved it, and compiled that program as 'copy.out'. When I run the command ./eof.out | ./copy.out in the terminal, the output I get is:

The part before EOF

This means the program 'copy.out' worked correctly, since it didn't print the second printf. But the passage above from the book indicates that there should've been some kind of failure since I changed the int into char, so what happened?
Thirdly, when I change the char c = getchar(); into double c = getchar(); and run the command ./eof.out | ./copy.out, the output I get is:

The part before EOF
�The part after EOF

Why didn't putchar(EOF); stop copy.out? Doesn't a double have more bytes than both int and char? What is happening?
Upvotes: 3
Views: 318
Reputation: 224546
getchar and putchar work with unsigned char values, not char values, so declaring c to be the char type causes the valid character 255 to be confused with EOF.
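As a minimal illustration of that confusion, assuming the common case of a signed 8-bit char and EOF defined as −1, storing the byte value 255 into a char makes it compare equal to EOF:

#include <stdio.h>

int main(void)
{
    int r = 255;   /* what getchar returns for an input byte of 0xFF */
    char c = r;    /* with a signed 8-bit char, this typically wraps to -1 */

    printf("c = %d, c == EOF: %d\n", c, c == EOF);   /* typically: c = -1, c == EOF: 1 */
    return 0;
}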
To simplify the explanation, this answer assumes a common C implementation, except where stated: char is signed and eight bits wide, EOF is −1, and conversions to signed integer types wrap modulo 2^w, where w is the width of the type in bits. The C standard permits some variation here, but these assumptions are typical in common C implementations and match the behavior reported in the question.
Consider this code for eof.c from the question:

#include <stdio.h>

int main()
{
    printf("The part before EOF\n");
    putchar(EOF);
    printf("The part after EOF\n");
}
When this program executes putchar(EOF), what happens is:

1. putchar converts EOF to unsigned char. This is specified in C 2018 7.21.7.3 (by way of 7.21.7.7 and 7.21.7.8).
2. Converting −1 to unsigned char yields 255, because conversion to an unsigned eight-bit integer type wraps modulo 256, and −1 + 256 = 255.
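A small experiment shows what that converted value looks like on the receiving end. This is a sketch that writes to a temporary file instead of a pipe and assumes the usual 8-bit unsigned char: fputc(EOF, f) writes the single byte 255, which fgetc then reads back as 255, not as an end-of-file indication.

#include <stdio.h>

int main(void)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return 1;

    fputc(EOF, f);        /* EOF (-1) is converted to unsigned char: the byte 255 */
    rewind(f);

    int byte = fgetc(f);  /* reads that byte back as an unsigned char value */
    printf("byte written by fputc(EOF, f): %d\n", byte);   /* prints 255 */

    fclose(f);
    return 0;
}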
"… changed int c = getchar(); in the program from the book into char c = getchar();, saved it and then compiled the program as 'copy.out'. When I run the command ./eof.out | ./copy.out in the terminal the output I get is: The part before EOF"
With char c = getchar();, what happens when the byte 255 is read and c = getchar() is evaluated is:

1. getchar returns 255. Note that it returns the character code as an unsigned char value, per C 2018 7.21.7.1 (by way of 7.21.7.5 and 7.21.7.6).
2. To assign the value to c, 255 is converted to the char type. Per the assumption above, this wraps modulo 256, producing −1.
3. −1 is the value of EOF, so c != EOF is false, so the loop ends, and the program exits.
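The early exit can be reproduced without a pipeline. The sketch below is an illustration rather than part of the original programs: it uses fmemopen (POSIX, available on GNU/Linux) to feed the same loop a 0xFF byte followed by ordinary text. With char c the loop stops immediately at that byte; declaring c as int instead would copy all four bytes.

#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

int main(void)
{
    unsigned char data[] = { 0xFF, 'h', 'i', '\n' };
    FILE *in = fmemopen(data, sizeof data, "rb");
    if (in == NULL)
        return 1;

    char c = fgetc(in);    /* same flaw as char c = getchar(); */
    while (c != EOF) {     /* 0xFF stored in a char compares equal to EOF */
        putchar(c);
        c = fgetc(in);
    }
    /* Nothing is printed: the loop exits on the very first byte. */

    fclose(in);
    return 0;
}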
"Why didn't putchar(EOF); stop copy.out? Doesn't a double have more bytes than both int and char? What is happening?"
With double c, the value assigned to c is the value returned from getchar; there is no change due to the destination type being unable to represent all the values getchar returns. When getchar returns the valid character code 255, c is set to 255, and the loop continues. When getchar returns the code −1 for end-of-file, c is set to −1, and the loop exits.
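A quick check of that reasoning, under the same assumptions: 255 and −1 remain distinct values when stored in a double, so only the real end-of-file return compares equal to EOF.

#include <stdio.h>

int main(void)
{
    double c;

    c = 255;                                /* a valid character code from getchar */
    printf("255 == EOF: %d\n", c == EOF);   /* 0: the loop keeps going */

    c = -1;                                 /* the actual end-of-file return value */
    printf("-1  == EOF: %d\n", c == EOF);   /* 1: the loop stops */

    return 0;
}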
"… the book indicates that there should've been some kind of failure since I changed the int into char …"
The passage from the book does not say there should be some kind of failure. It says EOF is “a value that cannot be confused with any real character”; it does not say you cannot convert EOF to a char. If your C implementation uses an unsigned char type, the conversion wraps the value modulo 2^w, where w is the number of bits in a char, usually eight, so modulo 256. For example, −1 maps to 255. If your C implementation uses a signed char type, the conversion is implementation-defined. So your eof.c program does not output an end-of-file indication when putchar(EOF) is evaluated. Instead, it outputs the character code 255.
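For completeness, a two-line check of that mapping (a sketch; the conversion of 255 to a signed char is implementation-defined, so the second result is the typical one rather than a guaranteed one):

#include <stdio.h>

int main(void)
{
    printf("(unsigned char)-1 = %d\n", (unsigned char)-1);   /* 255: -1 + 256 */
    printf("(char)255         = %d\n", (char)255);           /* typically -1 with a signed 8-bit char */
    return 0;
}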
Upvotes: 5