Reputation: 1071
I am programming in C on Windows. I encountered this problem while trying to read a .tar.gz file.
The file looks like this (opened with Notepad++):
The code I used to read it is as follows:
iFile = fopen("my.tar.gz", "r");
while ((oneChar = fgetc(iFile)) != EOF) {
printf("%c", oneChar);
}
The following figure shows the result of my program:
The problem I have is that the result only has several lines, while the original file has thousands of lines (6310 lines, as you can see). My guess is that the .tar.gz file contains some strange characters (like an EOF in the middle of the file?).
My question is why Notepad++ can display the whole file while my program cannot. Is there a solution to this problem?
Upvotes: 0
Views: 3517
Reputation: 1
A .tar.gz file is conventionally a gzip-compressed tar archive. It is of course a binary file (any '\n' or '\r' inside it does not delimit lines, and '\0' may appear inside), so you need to open it with
iFile = fopen("my.tar.gz", "rb");
if (!iFile) { perror("my.tar.gz"); exit(EXIT_FAILURE); }
Also, feof(iFile) is valid only after some <stdio.h> input operation, so while(!feof(iFile)) is wrong just after the fopen ...
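As a rough illustration of the binary-mode point, here is a minimal sketch of a byte-by-byte read loop. The helper name count_bytes is made up for this example and is not from the question. It assumes the early stop is caused by text mode: Microsoft's C runtime treats a Ctrl-Z byte (0x1A) as end-of-file on text-mode input, and a compressed file will almost certainly contain one somewhere. The loop variable is an int rather than a char so that a 0xFF byte cannot be mistaken for EOF.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: count the bytes in `path`, reading in binary mode.
   Returns -1 if the file cannot be opened or a read error occurs. */
long count_bytes(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "rb");   /* "b": no newline translation, no Ctrl-Z EOF */
    if (f == NULL)
        return -1;

    long n = 0;
    int c;                         /* int, so byte 0xFF stays distinct from EOF */
    while ((c = fgetc(f)) != EOF)
        n++;

    int failed = ferror(f);        /* distinguish a read error from a real EOF */
    fclose(f);
    return failed ? -1 : n;
}
```

If count_bytes("my.tar.gz") matches the size the file system reports, the whole file was read; with "r" instead of "rb" on Windows the count would typically come out smaller.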
But that alone won't help you extract any files from the archive.
You first need to uncompress the file, then extract or list the relevant archived files in it.
You could find libraries (and command executables) for both the uncompression (zlib library, gunzip or zcat commands) and the archive extraction (libarchive library, or libtar, or tar command) steps.
If your operating system provides it, you could consider using the popen function appropriately (Microsoft's C runtime calls it _popen).
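A minimal sketch of the popen approach follows. The helper name copy_command_output and the POPEN_READ/PCLOSE macros are invented for this example. It assumes a POSIX-style popen, or _popen on Windows, and that whatever command you pass is actually installed and on the PATH.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* exposes popen/pclose on POSIX systems */
#endif
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#define POPEN_READ(cmd) _popen((cmd), "rb")  /* binary mode, as with fopen */
#define PCLOSE(p)       _pclose(p)
#else
#define POPEN_READ(cmd) popen((cmd), "r")    /* POSIX allows only "r" or "w" */
#define PCLOSE(p)       pclose(p)
#endif

/* Hypothetical helper: run `cmd` through the shell and copy everything it
   writes to its standard output into `out`.
   Returns the number of bytes copied, or -1 on failure. */
long copy_command_output(const char *cmd, FILE *out)
{
    assert(cmd != NULL && out != NULL);
    FILE *pipe = POPEN_READ(cmd);
    if (pipe == NULL)
        return -1;

    char buf[4096];
    long total = 0;
    size_t n;
    int failed = 0;
    while ((n = fread(buf, 1, sizeof buf, pipe)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            failed = 1;
            break;
        }
        total += (long)n;
    }
    if (ferror(pipe))
        failed = 1;

    int status = PCLOSE(pipe);     /* waits for the command to finish */
    if (failed || status != 0)
        return -1;
    return total;
}
```

For instance, copy_command_output("zcat my.tar.gz | tar tf -", stdout) should print the names of the archive members, provided zcat and tar are available on your system.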
BTW, using putchar(oneChar) is shorter, simpler and faster than printf("%c", oneChar) ...
Upvotes: 4
Reputation: 60017
Usually a file ending in tar.gz is a compressed tar file (a binary file). Therefore I would suggest you use popen (http://linux.die.net/man/3/popen) instead of fopen to open the file using a command.
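For example:
iFile = popen("zcat my.tar.gz | tar xf -", "r");
Note that tar xf - extracts the archive members to disk and writes nothing to the pipe; use tar tf - instead if you want to read a listing of the members through iFile.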
Upvotes: 3