Sss
Sss

Reputation: 1529

How to find compression efficiency using Huffman encoding?

I have compressed a binary file using Huffman encoding. Now I am trying to find the compression efficiency.

In my binary file I have symbols (buch of 0 & 1) and frequency (repetition of symbols). Suppose I have:

Currently every symbol is encoded in UInt64, so size of file would be (173+50+48+45)*8=2528 bytes if my way of calculating the size is correct. Please correct me if I am wrong. On debugging I get 2536, 8 more I don't know why?

After compression I got encoding like this

Could some one please tell me how to get Huffman compression of this binary file using these information? I tried searching on Google but there is no sample of binary file they have some frequency of float type which I am not able to understand how to relate them with my Binary file.

Upvotes: 1

Views: 2972

Answers (2)

Andrew Polar
Andrew Polar

Reputation: 1

Based on provided frequencies the tree is wrong. It must be 0 10 110 111 It always ends on with all positive bits. Solutions, which are different than Huffman trees, may work but do not provide best compression.

Upvotes: 0

unwind
unwind

Reputation: 399793

You need to simply compute how many total bits you end up with:

173 * 1 + 50 * 2 + 48 * 3 + 45 * 3

That comes to 552 bits. Converting to whole bytes gives us 69 bytes. So that's a compression to 69/2528 or about 3% of the original, assuming the de-compressor can know the dictionary and so on. Also assuming your input symbols (0 to 3) are 64-bit values, for some reason.

Upvotes: 1

Related Questions