Reputation: 1511
I'm going through the python.org Python tutorial at the moment. I'm on section 10.9, and I'm trying to use the zlib library to compress a string. However, len(compressedString)
isn't always less than len(originalString)
. My interpreter session is below:
>>> import zlib
>>> s = 'the quick brown fox jumps over the lazy dog'
>>> len(s)
43
>>> t = zlib.compress(s)
>>> len(t)
50
>>> t
'x\x9c+\xc9HU(,\xcdL\xceVH*\xca/\xcfSH\xcb\xafP\xc8*\xcd-(V\xc8/K-R(\x01J\xe7$VU*\xa4\xe4\xa7\x03\x00a<\x0f\xfa'
>>> len(zlib.decompress(t))
43
>>> s2 = "something else i'm compressing"
>>> len(s2)
30
>>> t2 = zlib.compress(s2)
>>> len(t2)
37
>>> s3 = "witch which has which witches wrist watch"
>>> len(s3)
41
>>> t3 = zlib.compress(s3)
>>> len(t3)
37
Does anyone know why this is happening?
Upvotes: 1
Views: 7199
Reputation: 1125208
The zlib compression algorithm is not always efficient:
>>> len(zlib.compress('ab'))
10
because it needs to add metadata (a stream header, a checksum trailer, and Huffman code tables) that can amount to more data than what you tried to compress. Use it on longer, less random data and it'll compress things just fine:
>>> lorem = 'Neque porro quisquam est qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit'
>>> len(lorem) * 100
9100
>>> len(zlib.compress(lorem * 100))
123
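The same contrast can be reproduced in Python 3, where zlib.compress takes bytes rather than str, so the inputs below are byte literals; the exact compressed lengths may vary slightly between zlib versions:

```python
import zlib

# Very short input: the fixed zlib header/trailer and block metadata
# outweigh any savings, so the output is longer than the input.
short = b'ab'
print(len(short), len(zlib.compress(short)))

# Longer, highly repetitive input: backreferences to earlier copies of
# the repeated text pay off, and the result shrinks dramatically.
lorem = b'Neque porro quisquam est qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit'
big = lorem * 100
print(len(big), len(zlib.compress(big)))
```

The second case compresses so well precisely because deflate can encode each repetition of `lorem` as a short backreference to the first copy.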
Upvotes: 11
Reputation: 112617
However, the len(compressedString) isn't always less than the len(originalString).
That would, of course, be impossible, at least if you expect to always be able to losslessly retrieve the original string: an algorithm that shrank every input could not be reversible, since there are more possible inputs of length n than there are outputs shorter than n, so some distinct inputs would have to compress to the same output.
The deflate algorithm will, however, never expand the data by more than a small percentage, plus six bytes for the zlib wrapper. The two-byte zlib header identifies the stream as zlib, and the four-byte trailer carries an Adler-32 integrity check on the data.
Upvotes: 2