Jimmy D

Reputation: 5376

Compressed JSON is bigger than non-compressed version

I'll try to clear up my question.

myJSON is a simple JSON string. len(myJSON) = 78

e is json.Marshal(myJSON)

From what I understand, e is now a []byte

Then I gzip e like this:

var buf bytes.Buffer
gz := gzip.NewWriter(&buf)
gz.Write(e)
gz.Close()

And buf.Len() = 96

So... why is my compressed buffer bigger than the original non-compressed string?

Edit: It's hilarious the trolls that down vote a question when someone is trying to understand WHY something is happening. Guess I should just blindly accept it and not ask.

Upvotes: 0

Views: 2580

Answers (2)

James Henstridge

Reputation: 43949

It is physically impossible to design a lossless compression algorithm that will reduce the size of every input document.

As a thought experiment, imagine that such a compressor existed and could compress any document by at least one bit.

Now let's say that I generate every document that is at most N bits long. That is 1 document of length 0, 2 of length 1, 4 of length 2, etc. This sequence works out to 2^(N+1)-1 total documents.

If we run all the documents through the compressor, the compressed versions will all be at most N-1 bits long. That means there can be at most 2^N-1 compressed documents, which is fewer than we started with. Either the compression system is lossy (in which case decompression won't necessarily give us the original document), or some documents must grow in size when compressed.
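The counting argument above can be checked numerically. This is only an illustrative sketch that treats documents as bit strings, as in the thought experiment; `countUpTo` is a made-up helper name, not part of any library.

```go
package main

import "fmt"

// countUpTo returns how many distinct bit strings have a length between
// 0 and n bits inclusive: 2^0 + 2^1 + ... + 2^n = 2^(n+1) - 1.
func countUpTo(n uint) uint64 {
	var total uint64
	for length := uint(0); length <= n; length++ {
		total += uint64(1) << length
	}
	return total
}

func main() {
	for _, n := range []uint{1, 8, 16} {
		inputs := countUpTo(n)      // documents of at most n bits
		outputs := countUpTo(n - 1) // distinct outputs of at most n-1 bits
		fmt.Printf("N=%d: %d inputs, only %d shorter outputs, %d inputs left over\n",
			n, inputs, outputs, inputs-outputs)
	}
}
```

For every N there are more inputs than shorter outputs, so at least two inputs would have to share an output, which is exactly what a lossless compressor cannot allow.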

Upvotes: 10

pixeloverflow

Reputation: 579

gzip wraps a re-encoded (DEFLATE) version of the original data in a header and a trailer. When the original data is really small, there is no guarantee that the compressed output will be smaller than the input.

So if your program constantly deals with small payloads like this, running them through a compression library may not be a good idea. When the data is consistently small, we sometimes serialize it into a binary stream instead.

Go gzip package ref:

Package gzip implements reading and writing of gzip format compressed files, as specified in RFC 1952.

RFC1952

gzip format and header:

http://www.onicos.com/staff/iz/formats/gzip.html
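To make the framing overhead concrete, this sketch gzips zero bytes of input so that everything in the output is overhead. Per RFC 1952 the fixed header is 10 bytes and the trailer (CRC32 plus input size) is 8 bytes; `gzipEmpty` is a made-up helper name.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// gzipEmpty compresses an empty input, so the result contains only the
// gzip header, an empty DEFLATE stream, and the trailer.
func gzipEmpty() ([]byte, error) {
	var buf bytes.Buffer
	gz := gzip.NewWriter(&buf)
	// Close writes the header (if not yet written), the final block, and the trailer.
	if err := gz.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	out, err := gzipEmpty()
	if err != nil {
		panic(err)
	}
	fmt.Printf("gzip of 0 bytes is %d bytes\n", len(out))
	fmt.Printf("magic: % x, compression method: %d\n", out[:2], out[2])
}
```

The first two bytes are the gzip magic number (1f 8b) and the third is the compression method (8 for DEFLATE). Any input has to save more than this fixed overhead before gzip comes out ahead.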

Upvotes: 5
