Reputation: 7213
I'm wondering why I need to cut off the last 4 characters after using gzcompress().
Here is my code:
header("Content-Encoding: gzip");
echo "\x1f\x8b\x08\x00\x00\x00\x00\x00";
$index = $smarty->fetch("design/templates/main.htm") ."\n<!-- Compressed by gzip -->";
$this->content_size = strlen($index);
$this->content_crc = crc32($index);
$index = gzcompress($index, 9);
$index = substr($index, 0, strlen($index) - 4); // Why cut off ??
echo $index;
echo pack('V', $this->content_crc) . pack('V', $this->content_size);
When I don't cut off the last 4 chars, the source ends like:
[...]
<!-- Compressed by gzip -->N
When I cut them off it reads:
[...]
<!-- Compressed by gzip -->
I could see the additional N only in Chrome's code inspector (not in Firefox's or IE's source view). But there seem to be four additional characters at the end of the code.
Can anyone explain to me why I need to cut off 4 chars?
Upvotes: 3
Views: 4955
Reputation: 655129
gzcompress implements the ZLIB compressed data format, which has the following structure:
  0   1
+---+---+
|CMF|FLG| (more-->)
+---+---+
(if FLG.FDICT set)
  0   1   2   3
+---+---+---+---+
|     DICTID    | (more-->)
+---+---+---+---+
+=====================+---+---+---+---+
|...compressed data...|    ADLER32    |
+=====================+---+---+---+---+
Here you can see that the last four bytes are an Adler-32 checksum.
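A minimal sketch to check this layout from PHP, assuming the zlib and hash extensions are loaded. The sample string is arbitrary and not taken from the question:

```php
<?php
// Arbitrary sample input (not from the original question).
$data = "Hello, zlib!";

$zlib = gzcompress($data, 9);

// First two bytes: CMF and FLG. RFC 1950 requires that
// CMF * 256 + FLG is a multiple of 31.
$cmf = ord($zlib[0]);
$flg = ord($zlib[1]);
$headerOk = (($cmf * 256 + $flg) % 31) === 0;

// Last four bytes: Adler-32 of the uncompressed data, stored big-endian.
$trailer = substr($zlib, -4);
$adler   = hash('adler32', $data, true); // raw 4-byte digest

printf("CMF=0x%02x FLG=0x%02x header check: %s\n", $cmf, $flg, $headerOk ? 'ok' : 'bad');
printf("trailer=%s adler32=%s\n", bin2hex($trailer), bin2hex($adler));
```

The two hex strings on the last line should be identical, which is the four-byte tail the question cuts off.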
In contrast, the GZIP file format is a list of so-called members, each with the following structure:
+---+---+---+---+---+---+---+---+---+---+
|ID1|ID2|CM |FLG|     MTIME     |XFL|OS | (more-->)
+---+---+---+---+---+---+---+---+---+---+
(if FLG.FEXTRA set)
+---+---+=================================+
| XLEN  |...XLEN bytes of "extra field"...| (more-->)
+---+---+=================================+
(if FLG.FNAME set)
+=========================================+
|...original file name, zero-terminated...| (more-->)
+=========================================+
(if FLG.FCOMMENT set)
+===================================+
|...file comment, zero-terminated...| (more-->)
+===================================+
(if FLG.FHCRC set)
+---+---+
| CRC16 |
+---+---+
+=======================+
|...compressed blocks...| (more-->)
+=======================+
  0   1   2   3   4   5   6   7
+---+---+---+---+---+---+---+---+
|     CRC32     |     ISIZE     |
+---+---+---+---+---+---+---+---+
As you can see, GZIP uses a CRC-32 checksum for the integrity check.
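For comparison, here is a small sketch that pulls the fixed header and the trailer out of a gzip member produced by gzencode(). The sample string and the field names passed to unpack() are illustrative choices, and a 64-bit PHP build is assumed so that the 32-bit values compare as non-negative integers:

```php
<?php
$data = "Hello, gzip!"; // arbitrary sample input

$gz = gzencode($data, 9);

// Fixed 10-byte header: ID1, ID2, CM, FLG, MTIME (4 bytes), XFL, OS.
$header = unpack('Cid1/Cid2/Ccm/Cflg/Vmtime/Cxfl/Cos', substr($gz, 0, 10));

// 8-byte trailer: CRC32 and ISIZE, both little-endian 32-bit values.
$trailer = unpack('Vcrc32/Visize', substr($gz, -8));

printf("magic=%02x %02x, CM=%d\n", $header['id1'], $header['id2'], $header['cm']);
printf("CRC32 matches crc32(): %s\n", $trailer['crc32'] === crc32($data) ? 'yes' : 'no');
printf("ISIZE matches strlen(): %s\n", $trailer['isize'] === strlen($data) ? 'yes' : 'no');
```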
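So to analyze your code:
echo "\x1f\x8b\x08\x00\x00\x00\x00\x00";
– puts out the following header fields:
0x1f 0x8b – ID1 and ID2, the magic bytes that identify the GZIP format
0x08 – CM, the compression method (8 = DEFLATE)
0x00 – FLG, no flags set
0x00 0x00 0x00 0x00 – MTIME, no modification time available
These are only 8 of the 10 header bytes; the two-byte ZLIB header (CMF, FLG) at the start of the gzcompress output ends up in the XFL and OS positions.
echo $index;
– puts out that two-byte ZLIB header followed by the compressed data in the DEFLATE data compression format
echo pack('V', $this->content_crc) . pack('V', $this->content_size);
– puts out the CRC-32 checksum and the size of the uncompressed input data in binary (little-endian), which together form the GZIP trailer
The trailer has to follow the compressed blocks directly, so the four Adler-32 bytes that gzcompress appends must be cut off; otherwise they sit between the compressed data and the CRC-32.
Upvotes: 8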
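As a rough check on the analysis in the answer above, the following sketch rebuilds the stream the same way the question does and hands it to gzdecode(). The HTML string is a stand-in for the Smarty template output, and the expectation that gzdecode() accepts arbitrary XFL and OS bytes is an assumption about typical zlib builds:

```php
<?php
// Stand-in for the rendered template (not the real page from the question).
$html = "<p>example page</p>\n<!-- Compressed by gzip -->";

$zlib = gzcompress($html, 9);

// Built as in the question: 8 header bytes, the zlib stream without
// its Adler-32 tail, then the gzip trailer (CRC32 + ISIZE).
$handBuilt = "\x1f\x8b\x08\x00\x00\x00\x00\x00"
    . substr($zlib, 0, -4)
    . pack('V', crc32($html))
    . pack('V', strlen($html));

// The zlib CMF/FLG bytes now occupy the XFL and OS slots of the gzip header.
printf("XFL=0x%02x OS=0x%02x\n", ord($handBuilt[8]), ord($handBuilt[9]));

$decoded = gzdecode($handBuilt);
var_dump($decoded === $html);
```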
Reputation: 229058
gzcompress produces the output described in RFC 1950; the last 4 bytes you're chopping off are the Adler-32 checksum. This is the "deflate" content encoding, so you can simply send "Content-Encoding: deflate" and not manipulate anything.
If you want to use gzip, use gzencode(), which produces the gzip format.
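A minimal sketch of that advice, letting the built-in functions produce the complete format instead of patching bytes by hand. The helper name encode_body() is hypothetical, and the Accept-Encoding handling is deliberately simplistic (it ignores q-values):

```php
<?php
// Hypothetical helper (not from the answers): pick an encoding and return
// the Content-Encoding value (or null) together with the matching payload.
function encode_body(string $body, string $acceptEncoding): array
{
    if (stripos($acceptEncoding, 'gzip') !== false) {
        return ['gzip', gzencode($body, 9)];      // gzip format (RFC 1952)
    }
    if (stripos($acceptEncoding, 'deflate') !== false) {
        return ['deflate', gzcompress($body, 9)]; // zlib format (RFC 1950)
    }
    return [null, $body];                         // send uncompressed
}

[$encoding, $payload] = encode_body("<p>example page</p>", 'gzip, deflate');

// In a real response one would then send something like:
//   if ($encoding !== null) { header("Content-Encoding: $encoding"); }
//   echo $payload;
printf("%s: %d bytes\n", $encoding ?? 'identity', strlen($payload));
```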
Upvotes: 2