Reputation: 3650
Having an odd problem - one of my app suites has to read and write gzip-compressed files that are used on both Windows and Linux, and I am finding that the files I generate with zlib on Linux are 2-3 times larger than the ones I generate with GZipStream on Windows. They read perfectly on either platform, so I know the compression is correct regardless of which platform created the file. The thing is, the files are transferred across the network at various times, and file size is obviously a concern.
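For reference, the zlib side does roughly the following (a minimal sketch using zlib's gz* convenience API, not my actual code; the file names are placeholders). The "9" in the mode string is what requests maximum compression:

    #include <stdio.h>
    #include <zlib.h>

    int main(void)
    {
        FILE *in = fopen("data.bin", "rb");    /* placeholder input */
        gzFile out = gzopen("data.gz", "wb9"); /* "wb9" = write, binary, level 9 (max) */
        if (in == NULL || out == NULL) {
            perror("open");
            return 1;
        }

        char buf[8192];
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
            if (gzwrite(out, buf, (unsigned)n) == 0) {
                fprintf(stderr, "gzwrite failed\n");
                break;
            }
        }

        fclose(in);
        gzclose(out); /* flushes remaining data and writes the gzip trailer */
        return 0;
    }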
My question is: GZipStream does not provide a way to specify the compression level the way zlib does, but I am using maximum compression on the zlib side. Shouldn't I see roughly the same file size, assuming GZipStream is written to use maximum compression as well?
Upvotes: 2
Views: 1860
Reputation: 48686
I think what you are experiencing is caused not by the compression algorithm itself, but by how the files are compressed. From the zlib manual:
"The zlib format was designed to be compact and fast for use in memory and on communications channels. The gzip format was designed for single- file compression on file systems, has a larger header than zlib to maintain directory information, and uses a different, slower check method than zlib."
I think what is happening is that the files on your Linux machine are being tarred together into one file, and then that one file is compressed. On Windows, I think each individual file is compressed first, and the compressed results are then stored in one file.
This is just my theory; I have nothing to really support it. I might try some tests at home later, just to satisfy my curiosity.
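For what it's worth, zlib itself can emit either wrapper around the same deflate data; here is a minimal sketch, assuming the deflateInit2() API, where windowBits selects the format. Compressing the same buffer with 15 and with 15 + 16 shows that the wrapper only accounts for a handful of bytes:

    #include <string.h>
    #include <zlib.h>

    /* windowBits selects the wrapper around the same deflate data:
     *   15      -> zlib wrapper: 2-byte header + 4-byte Adler-32 trailer
     *   15 + 16 -> gzip wrapper: 10+ byte header + 8-byte CRC-32/size trailer */
    static int compress_with_wrapper(const unsigned char *in, unsigned in_len,
                                     unsigned char *out, unsigned out_len,
                                     int window_bits)
    {
        z_stream strm;
        memset(&strm, 0, sizeof strm);
        if (deflateInit2(&strm, Z_BEST_COMPRESSION, Z_DEFLATED,
                         window_bits, 8, Z_DEFAULT_STRATEGY) != Z_OK)
            return -1;

        strm.next_in   = (unsigned char *)in;
        strm.avail_in  = in_len;
        strm.next_out  = out;
        strm.avail_out = out_len;

        int ret = deflate(&strm, Z_FINISH); /* one-shot: input fits in memory */
        deflateEnd(&strm);
        return ret == Z_STREAM_END ? (int)(out_len - strm.avail_out) : -1;
    }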
Upvotes: 1
Reputation: 3650
And the answer is ... the Linux version was never compressing the data to begin with. It took a lot of debugging to find the bug that caused it, but after correcting it, the sizes are now comparable on both platforms.
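As a purely hypothetical illustration of how this kind of bug can hide (not necessarily what happened in my code): a compression level of 0 still produces a perfectly valid gzip file, because the deflate stream just uses "stored" blocks. Everything reads back fine on every platform, but nothing actually shrinks:

    #include <stdio.h>
    #include <zlib.h>

    int main(void)
    {
        int level = 0; /* hypothetical: a level that was never set */
        char mode[8];
        snprintf(mode, sizeof mode, "wb%d", level);

        /* Level 0 writes a VALID gzip file with stored (uncompressed)
         * deflate blocks: it decompresses fine everywhere, but the
         * output is slightly LARGER than the input. */
        gzFile out = gzopen("data.gz", mode);
        if (out == NULL)
            return 1;
        gzwrite(out, "example payload", 15);
        gzclose(out);
        return 0;
    }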
Upvotes: 1