Reputation: 21
I have a numeric vector with 10,000 elements, most of which are zero. In this situation, which lossless compression algorithm should I choose: arithmetic coding or Huffman coding? Thanks in advance!
Upvotes: 0
Views: 839
Reputation: 9007
Which implementations of arithmetic and Huffman coding? Are you allowed to treat the vector as a stream of bits and select the word size, or are you compressing the actual values in the vector? The smart money is usually on arithmetic coding for compression ratio and Huffman for speed, but the devil is in the details.
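To see why arithmetic coding tends to win on ratio here: a Huffman code spends at least one whole bit per symbol, while an arithmetic coder can get close to the Shannon entropy, which is far below one bit when almost everything is zero. A rough sketch of that comparison follows. The test vector (9,900 zeros plus a handful of 7s and 3s) is made up for illustration, the helper names are my own, and this only measures code lengths; it ignores the cost of storing the code table or model.

```python
import heapq
import math
from collections import Counter


def entropy_bits_per_symbol(values):
    """Shannon entropy of the empirical symbol distribution, in bits/symbol."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def huffman_code_lengths(values):
    """Return {symbol: code length in bits} for a Huffman code built on values."""
    counts = Counter(values)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tiebreaker, {symbol: depth so far}).
    heap = [(c, i, {s: 0}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def huffman_bits_per_symbol(values):
    """Average Huffman code length over the vector, in bits/symbol."""
    counts = Counter(values)
    lengths = huffman_code_lengths(values)
    return sum(counts[s] * lengths[s] for s in counts) / len(values)


# Hypothetical sparse vector: 10,000 elements, 99% zeros.
vector = [0] * 9900 + [7] * 60 + [3] * 40

print("entropy bound :", entropy_bits_per_symbol(vector))
print("Huffman       :", huffman_bits_per_symbol(vector))
```

On this made-up vector the entropy comes out at roughly 0.09 bits per symbol, while Huffman cannot do better than about 1 bit per symbol. Huffman can narrow the gap if you group several values into one symbol, which is where the word-size question above matters.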
If the vector is truly sparse, RLE (run-length encoding) will give you close to optimal results with very little overhead:
value, number of zeroes to next value, value, number of zeroes to next value
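A minimal sketch of that layout in Python, assuming each entry is stored as a (value, zeros that follow it) pair. The function names are my own, and one detail is a choice I made rather than part of the format: if the vector starts with zeros, the first "value" is simply 0 and the remaining leading zeros go into its run count.

```python
def rle_encode(vector):
    """Encode as (value, number of zeros until the next value) pairs."""
    pairs = []
    i = 0
    n = len(vector)
    while i < n:
        value = vector[i]  # may itself be 0 if the vector starts with zeros
        i += 1
        run = 0
        # Count the zeros that follow this value.
        while i < n and vector[i] == 0:
            run += 1
            i += 1
        pairs.append((value, run))
    return pairs


def rle_decode(pairs):
    """Invert rle_encode: emit each value followed by its run of zeros."""
    out = []
    for value, run in pairs:
        out.append(value)
        out.extend([0] * run)
    return out


sample = [0, 0, 5, 0, 0, 0, 3]
encoded = rle_encode(sample)
print(encoded)
print(rle_decode(encoded) == sample)
```

The saving depends on how the pairs are stored afterwards. If the run counts need as many bytes as the values, the output only shrinks when runs of zeros are long on average, so it is worth measuring on the real data.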
Upvotes: 1