Reputation: 453
Is it possible to read a file past its EOF?
I am reading a file that may contain an EOF character before its end, or several EOF characters. The file is a plain .txt, and I can get the number of characters using fsize, but it looks like getc returns EOF (or -1) from the first EOF character through to the end of the file.
int c = 0;
char x;
FILE *file = fopen("MyTextFile.txt", "r");
off_t size = fsize("MyTextFile.txt");
while (c < size) {
    x = getc(file);
    if (x != -1)
        printf("%c ", x);
    else
        printf("\nFOUND EOF!\n");
    c++;
}
fclose(file);
Unfortunately, even though I'm sure the file content continues after the EOF, I cannot read the rest.
SOLVED: Opening the file with "rb" instead of "r" and declaring x as int allowed me to read the whole file, including the multiple EOF characters. I'm not sure whether this is a trick or something that is actually allowed, but it works.
Upvotes: 2
Views: 4641
Reputation: 123468
7.21 Input/output <stdio.h>
7.21.1 Introduction
...
3 The macros are...
EOF
which expands to an integer constant expression, with type int and a negative value, that is returned by several functions to indicate end-of-file, that is, no more input from a stream;
EOF isn't a character in the file itself; it's a value returned by the input function to indicate that there is no more input available on the stream. You can't read past it, because there's nothing to read.
Upvotes: 0
Reputation: 263267
Logically, there is no data after EOF (end of file).
Note that EOF is not a character; it's a special value that getc() returns, in place of a character value, once an end-of-file or error condition has been encountered.
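Because EOF is a return value rather than data, a reader can ask the stream afterwards why it was returned. The following is a minimal sketch, not taken from the question; drain_stream is a hypothetical helper name, and it assumes the caller has already opened the stream.

```c
#include <stdio.h>

/* Hypothetical helper: reads the stream until getc() returns EOF.
   Stores the number of bytes read in *bytes_read (if non-NULL).
   Returns 0 if the loop stopped at end-of-file, 1 if it stopped
   because of a read error. */
int drain_stream(FILE *fp, long *bytes_read)
{
    long n = 0;
    int c;                          /* int, so EOF stays distinct from any byte */

    while ((c = getc(fp)) != EOF)
        n++;

    if (bytes_read != NULL)
        *bytes_read = n;

    /* EOF alone does not say which condition occurred; ferror() does. */
    return ferror(fp) ? 1 : 0;
}
```

After the call, feof(fp) reports whether the end-of-file indicator is set, and further calls to getc(fp) keep returning EOF.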
You haven't said so in the question, but my guess is that you have a Windows text file with one or more embedded Ctrl-Z (0x1a) characters. That's the only thing I can think of that's consistent with your description.
In Windows, a Ctrl-Z character in a text file is treated as the end of the file. (This goes back to earlier systems where the end of the data was not clearly marked, because the file system only recorded the number of blocks.) Ctrl-Z is not an EOF character; it's a character value that, on Windows, triggers an end-of-file condition and causes getc() to return EOF.
Basically you have a malformed text file, and you should probably just fix it and/or fix whatever generated it. But if you really need to read data from it, I suggest opening it in binary mode rather than text mode. You'll then see each CR/LF end-of-line marker as two characters ('\r', '\n') rather than just '\n', and Ctrl-Z (0x1a) is just another byte value. Since you're not really treating the file as text (the "text" ends at the first Ctrl-Z), it makes sense to read it in binary mode.
There are probably tricks you can play to read past the Ctrl-Z in text mode; for example, clearerr() is likely to work. But doing that goes beyond what the C standard guarantees, which may or may not be a problem for you.
Also, you should definitely use the symbol EOF, not the "magic number" -1. It's not even guaranteed that EOF == -1, and using the symbol EOF will make your code much clearer.
Finally, thanks to Mark Plotnick for pointing out in a comment something I should have noticed myself. getc() returns an int result; you're assigning it to a char object. x needs to be of type int, not char. This is necessary so you can distinguish between the value of EOF and the value of any actual character.
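To illustrate why the int type matters, here is a small sketch; first_byte is a hypothetical helper, not something from the question. getc() returns each byte as an unsigned char value converted to int, so a 0xFF byte comes back as 255 and stays distinct from the negative EOF. If the result were stored in a char, that byte could compare equal to EOF on implementations where char is signed and EOF is -1.

```c
#include <stdio.h>

/* Hypothetical helper: returns the first byte of the file as an int
   in the range 0..UCHAR_MAX, or EOF if the file is empty or cannot
   be opened. */
int first_byte(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return EOF;

    int c = getc(fp);   /* kept as int: 0xFF stays 255, EOF stays negative */
    fclose(fp);
    return c;
}
```

With this, a file that starts with a 0xFF byte and a file that is empty give different results, which a char variable cannot guarantee.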
Upvotes: 7
Reputation: 11492
Your code is incomplete so it's hard to say what the problem is, but I would suggest:
x is of type int
Upvotes: 0