Reputation: 65
I'm trying to transfer a .tar.gz file from a server to a client through sockets in C, but I'm running into a problem: reading from a tar.gz with gzopen + gzread and writing the data to a new tar.gz file with gzwrite results in a file that's corrupted and not even the same size as the original.
At first I thought it was something to do with my socket logic, but I tested it on a smaller scale and just read and immediately wrote it to the same directory and even then the error persists. The following is some example code to demonstrate the issue:
gzFile gz = gzopen("tar.tar.gz", "r");
// tar.tar contains one text file, tarred with "tar -cvf tar.tar text.txt"
// tar.tar.gz created with "gzip tar.tar"
struct stat st;
stat("tar.tar.gz", &st); // get size
unsigned int gz_buffer_size = st.st_size;
printf(".gz size: %d\n", gz_buffer_size);
unsigned char *gz_buffer = malloc(gz_buffer_size);
gzread(gz, buffer, buffer_size);
gzclose(gz);
gzFile test = gzopen("test.tar.gz", "w");
printf("wrote to test: %d\n", gzwrite(test, gz_buffer, gz_buffer_size));
gzclose(test);
Upon running the above, the program prints to stdout that the size of my .gz is 169 bytes and even gzwrite prints that it wrote 169 bytes to the new .gz file. So why is it that when I then run
stat -c %s test.tar.gz
I get that the size of test.tar.gz is 24?
Upvotes: 0
Views: 1235
Reputation: 52336
A few things here:
You're statting a compressed file to get its size, and reading some number of uncompressed bytes from it. You don't provide any definition or initialization of buffer_size, but you're probably not reading the entire file, just a portion of it.
You're completely ignoring gzread()'s return value, so you don't know how many bytes were actually read.
You're reading into buffer and writing gz_buffer, which is uninitialized.
You're then compressing and writing that uninitialized memory to another file. gzwrite() returns the number of bytes compressed, not the length of the compressed fragment (which it can't know because of things like blocks, padding etc. that depend on future writes, if any). It's not unreasonable that 169 uncompressed bytes goes down to 24 compressed bytes.
You're not doing any error checking; all of your functions should have checks to make sure they succeeded before doing anything with what they return/set.
Upvotes: 1