Reputation: 30128
Wikipedia states (wrongly, apparently, at least in practice) that the gzip format requires the last 4 bytes to be the uncompressed size (mod 4 GB). I found a credible answer on SO explaining that there is sometimes junk at the end of the gzip data, so you cannot rely on the last 4 bytes being the size.
Unfortunately this matches my experiments (both terminal gzip and the 7zip archiver add a 0x0A byte for my small test example).
My question is: why do gzip and 7zip do this? Obviously they do it because they are written that way, but I wonder about the motivation for breaking the format specification. I know that some formats have padding requirements, but I found nothing of the kind for gzip.
Edit: the process I used:
echo "Testing rocks:) Debugging sucks :(" >> test_data
rm test_data.gz
gzip -6 test_data
vim -c "noautocmd edit test_data.gz"
In Vim: :%!xxd -c 4
The last 5 bytes are then the size (35) followed by 0x0a (0x23 = 35, so: 23 00 00 00 0a).
The 7zip process is just using the GUI to make an archive.
Upvotes: 0
Views: 299
Reputation: 7981
Your testing process is wrong. Vim is what adds 0x0A to the end of the file. Here is a simpler test, using xxd directly (why did you even use Vim?):
echo "Testing rocks:) Debugging sucks :(" >> test_data
gzip -6 test_data
xxd -c 4 test_data.gz
Output:
0000000: 1f8b 0808 ....
0000004: 453c 5d59 E<]Y
0000008: 0003 7465 ..te
000000c: 7374 5f64 st_d
0000010: 6174 6100 ata.
0000014: 0b49 2d2e .I-.
0000018: c9cc 4b57 ..KW
000001c: 28ca 4fce (.O.
0000020: 2eb6 d254 ...T
0000024: 7049 4d2a pIM*
0000028: 4d4f 0789 MO..
000002c: 1497 0245 ...E
0000030: 14ac 34b8 ..4.
0000034: 00f4 a724 ...$
0000038: 5623 0000 V#..
000003c: 00 .
As you can see, there is no 0x0A at the end: the file ends with 23 00 00 00, which is the size (35) stored as a little-endian 32-bit value. I think Vim adds a newline to the end of a file by default if one is not present.
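If you want to check the trailer without any editor in the way, here is a minimal Python sketch. It assumes a single-member gzip stream and compresses the same test string in memory with the standard gzip module, so the header bytes will not match the dump above exactly. The helper name gzip_trailer is made up for this illustration.

```python
import gzip
import struct
import zlib


def gzip_trailer(data):
    """Return (crc32, isize) read from the last 8 bytes of a gzip stream.

    Assumes a single-member stream with nothing appended after it.
    Both fields are stored as little-endian unsigned 32-bit integers.
    """
    crc32, isize = struct.unpack("<II", data[-8:])
    return crc32, isize


# Same content that `echo` writes to test_data (including the newline).
payload = b"Testing rocks:) Debugging sucks :(\n"
compressed = gzip.compress(payload, compresslevel=6)

crc, isize = gzip_trailer(compressed)
print("uncompressed length:", len(payload))
print("ISIZE from trailer: ", isize)
print("CRC matches:        ", crc == zlib.crc32(payload))

# Simulate what happens when a tool appends a newline to the file:
# the size field is still there, but it is no longer the last 4 bytes.
damaged = compressed + b"\n"
print("last 5 bytes after appending 0x0A:", damaged[-5:].hex(" "))
```

The last line reproduces the 23 00 00 00 0a pattern from the question, which suggests the extra byte comes from whatever touched the file after gzip wrote it, not from gzip itself.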
Upvotes: 2