Which algorithm is most suitable for Burrows-Wheeler Transform?

It seems many compressors implementing the BWT use it in conjunction with arithmetic coding, or Huffman coding. (Feel free to nominate more, especially if they're better.)

My first question is: why would a dictionary coder, such as LZW or LZSS, be a worse choice to use with the BWT?

My second question is: which is the best all-round algorithm to use?

Upvotes: 0

Views: 1072

Answers (1)

richselian
richselian

Reputation: 731

  1. BWT uses all sizes of contexts, while a practical LZ implementation can hardly use contexts of size shorter then 3.
  2. BWT benefits from every matches inside a block, while normal LZ implementation only find matches in the look-forward window.

But LZ is not a worse choise in many situations. LZ is an on-line algorithm and can work on streams, while BWT must work on a large block and cost lots of memory. LZ's decompression is very effecient while BWT still needs at least 5n extra space.

BWT's performance is related to suffix sorting. There are many practical suffix sorting algorithms: MSufSort/DivSufSort/Archon/QSufSort/DeepSwallow, and theoretically optimal algorithms with O(n) time: SA-IS/SA-DS.

PS/ If you are writing a BWT based compressor, pay more attention of encoding BWT's output but not BWT itself, since there are many practical libs for BWT transform and most of them share the same interface. just use one of them in your project.

Upvotes: 2

Related Questions