Reputation: 13908
Our products use the Greenleaf Archive Library, an old compression library for Windows. We're now looking to move to the Mac, but I'm pretty sure the .lib files we got from Greenleaf won't work on that platform.
Other than just switching to another compression library, which would be problematic for several reasons, does anyone know of any alternatives, like an open source version of the library or a mac port?
Upvotes: 1
Views: 935
Reputation: 10796
Some source code for the Greenleaf Archive Lib exists, but it is obfuscated into various functions. With a modern IDE you can do a pretty good job of untangling it by refactoring. I also spoke with Mark Nelson, who worked on it, and he told me they had sublicensed the code from Robert Jung, who wrote ARJ ("Archived by Robert Jung"). The ARJ source code has since been made available, though I can find no acceptable license for it. I can confirm the archiving is the same, except that Greenleaf uses a smaller sliding window.
Compression Parameters
The HUS/VIP compression does seem to use the ARJ decoding (methods 1 to 3) with a dictionary size of 1024 instead of 26k.
Parameter    Value
CODE_BIT        16
THRESHOLD        3
MAXMATCH       256
DICBIT          10
DICSIZE       1024
NT              19
TBIT             5
NP              15
PTABLESIZE     256
PBIT             5
NC             511
CTABLESIZE    4096
CBIT             9
NPT             19
https://community.kde.org/Projects/Liberty/File_Formats/Husqvarna_HUS
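As a sanity check, the parameters above are internally consistent: DICSIZE is just 1 shifted left by DICBIT, and each symbol count fits within its code-bit width. A quick check in Python (the dictionary below simply restates the table):

```python
# HUS/VIP (Greenleaf) decoding parameters, restated from the table above
params = {
    "CODE_BIT": 16, "THRESHOLD": 3, "MAXMATCH": 256,
    "DICBIT": 10, "DICSIZE": 1024,
    "NT": 19, "TBIT": 5,
    "NP": 15, "PTABLESIZE": 256, "PBIT": 5,
    "NC": 511, "CTABLESIZE": 4096, "CBIT": 9,
    "NPT": 19,
}

# The dictionary size is set by a bit shift of DICBIT.
assert params["DICSIZE"] == 1 << params["DICBIT"]

# Each symbol count fits in its bit width.
assert params["NC"] < 1 << params["CBIT"]   # 511 character symbols, 9 bits
assert params["NP"] < 1 << params["PBIT"]   # 15 distance entries, 5 bits
assert params["NT"] < 1 << params["TBIT"]   # 19 length-table entries, 5 bits
```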
Now, the parameter DICSIZE in ARJ is 26k rather than 1k, and NC is 510 in ARJ rather than 511 as it is in Greenleaf. The decode compression parameter changes the dictionary size: each extra level of compression goes up by a power of 2, because the size is actually set by a bit shift.
I don't see any properly licensed code, so you may have to rewrite the ARJ compression scheme from scratch. Technical details are sparse, and the source code itself is strange. Jung apparently got a (now expired) patent on the strange part, where he does the Huffman encoding by building table elements in memory as hash tables and using them to index-find the next entry. It appears to be an LZSS scheme with an optional user-provided dictionary. The actual workings of the algorithm are very much like zip's deflate scheme; it might even be spot on for deflate. So if compiling obfuscated source code you might not have isn't an option, you can rewrite it.
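For orientation, the LZSS part of such a scheme is simple: decoded output is built from literal bytes and (length, distance) copies out of the already-decoded window. The sketch below is illustrative only and is not the actual ARJ/Greenleaf bit format:

```python
def lzss_decode(tokens):
    """Minimal LZSS sketch: each token is either a literal byte (an int)
    or a (length, distance) back-reference into the decoded output."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, int):       # literal byte
            out.append(tok)
        else:                          # (length, distance) copy
            length, distance = tok
            for _ in range(length):    # byte-by-byte allows overlapping copies
                out.append(out[-distance])
    return bytes(out)

# "ab" followed by a 4-byte copy at distance 2 repeats the pair.
assert lzss_decode([97, 98, (4, 2)]) == b"ababab"
```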
I rewrote the decompression from scratch as MIT-licensed code:
https://github.com/EmbroidePy/pyembroidery/blob/master/pyembroidery/EmbCompress.py
This will also work for ARJ decompression, since I didn't hard-code the window size: it just reads the data in the file, whatever size window that implies.
For the compression side I didn't want to do the massive legwork of writing a real encoder, so I developed a way to cheat. We write a fake Huffman table of 256 entries, each 8 bits long. Canonical assignment numbers tied lengths lowest symbol first, so the codes map exactly onto the byte values themselves.
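Why the cheat works: canonical Huffman assignment orders codes by length, breaking ties by symbol value, so if all 256 symbols have length 8 then symbol i simply receives code i. A sketch of that assignment (my own illustration, not pyembroidery's actual table builder):

```python
def canonical_codes(lengths):
    """Assign canonical Huffman codes: shortest lengths first,
    ties broken by symbol value, incrementing the code each time."""
    codes = {}
    code = 0
    last_len = 0
    for sym, ln in sorted(enumerate(lengths), key=lambda p: (p[1], p[0])):
        code <<= ln - last_len   # widen the code when the length grows
        codes[sym] = (code, ln)  # (code value, code length in bits)
        code += 1
        last_len = ln
    return codes

# With 256 equal 8-bit lengths, every symbol's code equals its own value.
codes = canonical_codes([8] * 256)
assert all(codes[sym] == (sym, 8) for sym in range(256))
```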
Block writing:
16 bits: file size.
- Writing the character-length Huffman table.
5 bits: 00000, we have 0 entries; there is 1 length, and every code gets it.
5 bits: 01010, that single value is 10, meaning length 8.
---- The char-length Huffman only ever returns 10.
- Writing the character Huffman table.
9 bits: 100000000, 256 entries.
---- All characters build with c_len value 10, which is 8, so no further bits are read.
---- A Huffman table of 256 8-bit entries.
- Writing the distance Huffman table (we never use this).
5 bits: 00000, no entries.
5 bits: 00000, the single value doesn't matter, since we never use it.
However!
16 + 10 + 9 + 10 = 45 bits.
45 % 8 = 5.
That leaves us 3 bits short of a byte boundary, so all of our data bytes would be offset by 3 bits. We need a header bit length that is an exact multiple of 8.
5 bits: 00001, 1 distance entry.
3 bits: 111, a variable-length value of 7 or more.
5 bits: 11110, each 1 bit adds one, so this adds 4 to the 7 for 11; the 0 terminates. Really we just want to pad bits.
(16) + (10) + (9) + (5 + 3 + 5) = 48 bits, i.e. 6 bytes.
2 bytes of file size, followed by these 4 bytes:
0b00000010101000000000000111111110
This is storable in a single integer:
0x02A001FE
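You can check that packing those fields really yields that integer:

```python
# Fields of the fixed header, in write order (after the 16-bit file size)
fields = [
    (0b00000, 5),      # char-length table: 0 entries
    (0b01010, 5),      # the single length value, 10 (= 8-bit codes)
    (0b100000000, 9),  # character table: 256 entries
    (0b00001, 5),      # distance table: 1 entry
    (0b111, 3),        # base value 7 ...
    (0b11110, 5),      # ... four 1-bits add 4 for 11, then a terminating 0
]

header, nbits = 0, 0
for value, width in fields:
    header = (header << width) | value
    nbits += width

assert nbits == 32            # exactly 4 bytes
assert header == 0x02A001FE   # the constant quoted above
```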
So, to perform Greenleaf or ARJ "compression", each block is:
2-byte-block-size + 0x02A001FE + uncompressed_stream
If you need more than 2^16 bytes, you'll need more blocks. You make them the same way.
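Putting the recipe together as a sketch (`fake_compress` is my own name for it, and the byte order of the 2-byte block size is an assumption to verify against real files):

```python
MAGIC = bytes.fromhex("02A001FE")  # the 4-byte fixed Huffman header built above

def fake_compress(data: bytes, byteorder: str = "big") -> bytes:
    """'Compress' data into blocks that store the bytes verbatim under
    the fixed all-literals Huffman tables.  NOTE: the byte order of the
    16-bit block size is an assumption; check it against real files."""
    out = bytearray()
    for i in range(0, len(data), 0xFFFF):
        block = data[i:i + 0xFFFF]                # at most 2^16 - 1 bytes
        out += len(block).to_bytes(2, byteorder)  # 16-bit uncompressed size
        out += MAGIC                              # fixed "stored" tables
        out += block                              # raw bytes read as literals
    return bytes(out)
```

Each block is independently decodable: the decoder reads the size, builds the fixed tables, then emits that many bytes as 8-bit literal codes.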
Upvotes: 0
Reputation: 872
The 2.12 version of ArchiveLib we've got here shipped with full source code and builds fairly painlessly under various compilers. Check your original install package and see if it has an option to install the source.
Edit (much later): As I mentioned in the comments, there was one missing method in the source code. As the link I put there seems to have vanished, I've added the code below.
//
// void ALStorage::YieldTime()
//
// ARGUMENTS:
//
// None.
//
// RETURNS
//
// Nothing.
//
// DESCRIPTION
//
// This function has two important things to do. It gets called
// at a few different points in the process of reading or writing data
// from storage objects. During normal reading and writing, it
// will get called every time the buffer is loaded or flushed.
//
// If we are in Windows mode, we execute a PeekMessage() loop. This
// makes sure that we aren't hogging the CPU. By doing it this way,
// the programmer can be sure that he/she is being a good citizen
// without any significant effort.
//
// The second important function is that of calling the monitor function.
// The user interface elements need to be updated regularly, and this
// is done via this call.
//
// REVISION HISTORY
//
// May 26, 1994 1.0A : First release
//
void AL_PROTO ALStorage::YieldTime()
{
    if ( mpMonitor )
        mpMonitor->Progress( Tell(), *this );
    /*
     * For right now I am going to put the PeekMessage loop in the load
     * buffer routine by default. Most Windows applications are going
     * to want to use this, right?
     */
#if defined( AL_WINDOWS_GUI )
    MSG msg;
    while ( PeekMessage( &msg, NULL, 0, 0, PM_REMOVE ) ) {
        TranslateMessage( &msg );
        DispatchMessage( &msg );
    }
#endif
}
Upvotes: 2