Reputation: 59
I'm trying to build a program for file compressing. so far I implemented the Huffman coding algorithm but I noticed it's not enough - the compressing is minimal, and I can compress only a few millions of bits and usually it only 1% of the original file. I searched for information about it and I found out that most of compressing file programs like bzip2 and gzip are using the combination of the LZW and Huffman algorithms together. When I tried to use the LZW algorithm I got stuck about how to do it on binary, on bits. Most of the examples and explanations about this algorithm are examing on alphabetical strings and some limited-inadequately information on binary. Is there a full clear guide for how to implement it on binary or a way to understand it simplly?
Upvotes: 1
Views: 1823
Reputation: 112374
Neither uses LZW. gzip uses LZ77, which finds matching strings in previous data. The literals and length/distance pairs are then sent using Huffman codes. bzip2 uses a Burrows-Wheeler transform, followed by a move-to-front, run-length encoding, and Huffman coding.
Upvotes: 1