Reputation: 12036
How do I correctly compress a string, so PHP would be able to decompress?
I tried this:
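import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;

public static byte[] compress(String string) throws IOException {
    ByteArrayOutputStream os = new ByteArrayOutputStream(string.length());
    // DeflaterOutputStream with its default Deflater writes a zlib stream.
    // Also tried GZIPOutputStream, same results!
    DeflaterOutputStream gos = new DeflaterOutputStream(os);
    // Encode explicitly as UTF-8 instead of relying on the platform default charset.
    gos.write(string.getBytes(StandardCharsets.UTF_8));
    gos.close(); // finishes the compressed stream and closes os as well
    return os.toByteArray();
}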
But PHP does not recognize the output as a valid gzip-compressed string...
The problem seems to be in some headers / footers being added by Android...
For example, when I compress the word "something" via PHP with gzcompress(), I get results similar to Android's, but not similar enough for PHP to read them:
"something" (HEX DUMP):
Android: 1f8b08000000000000002bcecf4d2dc9c8cc4b0700 fb31da0909000000
PHP: 789c2bcecf4d2dc9c8cc4b0700 134703cf
The weirdest thing is that changing GZIPOutputStream to DeflaterOutputStream fixed the problem with the word "something", but the problem still appears with longer strings...
PS. Removing the first 10 bytes from the Android-generated data does not help at all.
EDIT: I tried to decompress it in PHP with:
- gzdecode() - this function does not exist in the standard Debian PHP5 version.
- gzuncompress() - does not work.
- Some functions from the PHP site comments that emulate gzdecode(), which don't really do much.
All of the above, both with the first 10 bytes removed and with them left in place.
PS2. I tried every single solution from Stack Overflow, and other sources, and still nothing. It is not a duplicate.
EDIT2 (BINARY DUMP): Sample data generated with Android that can't be decompressed by gzuncompress() or the pseudo-gzdecode() functions from PHP.NET: data.compressed.
It is supposed to be some JSON after decompression.
Upvotes: 2
Views: 767
Reputation: 112349
The Android data that starts with 1f8b is a gzip stream. In PHP you use gzdecode() for that. gzencode() in PHP makes gzip streams.
The PHP data that starts with 789c is a zlib stream. You used gzcompress() to make that, and you would use gzuncompress() to decode it.
The compressed data contained within both of those streams, starting with 2bce, is raw deflate data. You can use gzinflate() to decode that if you happened to make it somewhere, and you can use gzdeflate() to generate raw deflate.
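To see the three formats side by side from the Java/Android end, here is a minimal sketch. The class name CompressionFormats and its helper methods are made up for illustration; only the java.util.zip classes are real. It assumes the default compression level and uses the nowrap flag of Deflater to switch between the zlib wrapper and raw deflate.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of any library.
final class CompressionFormats {

    // gzip stream (RFC 1952): what PHP's gzencode() produces and gzdecode() reads.
    static byte[] gzip(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // zlib stream (RFC 1950): what PHP's gzcompress() produces and gzuncompress() reads.
    static byte[] zlib(byte[] input) {
        return deflate(input, false);
    }

    // Raw deflate (RFC 1951): what PHP's gzdeflate() produces and gzinflate() reads.
    static byte[] rawDeflate(byte[] input) {
        return deflate(input, true);
    }

    // nowrap = true drops the 2-byte zlib header and the 4-byte Adler-32 trailer.
    private static byte[] deflate(byte[] input, boolean nowrap) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, nowrap);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out, deflater)) {
            dos.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            deflater.end(); // release native resources of the custom Deflater
        }
        return out.toByteArray();
    }

    // Lowercase hex dump, in the same style as the dumps in the question.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] text = "something".getBytes(StandardCharsets.UTF_8);
        System.out.println("gzip: " + hex(gzip(text)));
        System.out.println("zlib: " + hex(zlib(text)));
        System.out.println("raw:  " + hex(rawDeflate(text)));
    }
}
```

For the word "something", the gzip output should begin with 1f8b and the zlib output with 789c, as in the dumps in the question. If the PHP side calls gzuncompress(), the zlib variant is the one to send; for gzinflate(), send the raw variant.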
Just to rant, gzencode(), gzcompress(), and gzdeflate() are some of the most misleading function names ever concocted, since only one of them is related to gzip yet all start with gz, and nothing in the name gzcompress() indicates zlib.
Update:
The "EDIT2" data is, for some reason, doubly compressed. It was compressed first to the zlib format, and then that zlib stream was compressed to the gzip format. (Though gzip couldn't compress the already compressed data, so it's a little bigger.)
You should repair the problem that made it doubly compressed. Or, if you have no control over that, you can doubly decompress it: first strip the gzip header as described in the RFC 1952 specification and use gzinflate() on the raw deflate data, then use gzuncompress() on the result.
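The same unwrapping order can be sketched in Java, which may help when checking what the Android side actually emits. This is an illustration under assumptions: the class DoubleDecompress and its methods are hypothetical names, and zlibThenGzip() only guesses at how the data ended up wrapped twice (a zlib stream written into a gzip stream, as described above). GZIPInputStream takes care of the RFC 1952 header and trailer, and InflaterInputStream then decodes the zlib stream inside.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class, not part of any library.
final class DoubleDecompress {

    // Assumed reproduction of the problem: zlib first, then gzip around it.
    static byte[] zlibThenGzip(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream zlib =
                 new DeflaterOutputStream(new GZIPOutputStream(out))) {
            zlib.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Undo both layers: the outer gzip wrapper first, then the inner zlib stream.
    static byte[] ungzipThenUnzlib(byte[] doublyCompressed) {
        try (InputStream in = new InflaterInputStream(
                 new GZIPInputStream(new ByteArrayInputStream(doublyCompressed)))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] json = "{\"key\":\"value\"}".getBytes(StandardCharsets.UTF_8);
        byte[] packed = zlibThenGzip(json);
        byte[] restored = ungzipThenUnzlib(packed);
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

If a payload only decodes with both layers removed like this, the sender is compressing twice, and dropping one of the two output streams on the Android side is the cleaner fix.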
Upvotes: 3