Reputation: 43
I need to write a program to compress/decompress txt files using Huffman algotrithm
I have writen it, and it works good for files that have less charachters than the buffer size, but it doesnt work for files with greater number of characters.
My problem is to interface compression buffer with decompression buffer.
So if the number of bytes written by the compression (which contains the 1 and 0 to go through the tree), is different from the number of bytes the decompression reads it does not work. Example, if the buffer of the compression writes 200, I need the buffer of decompression to read exactly 200 bytes.
If i set the size of decompression to read 200, somewhere the compression will write 200 and other times less or more than 200.
Can you suggest anything how to keep track of the numbers of byte written by compression each time and transmit it to decompression part?
Upvotes: 0
Views: 1352
Reputation: 13948
A common way to "track" the end of stream is to add an N+1 "EOF" symbol specifically for this usage. This way, you don't need to maintain any "size" counter.
Upvotes: 2
Reputation: 2481
I did't use any buffers. In header of my file I write down code length, and code itself. So when I want to decompress my file, first I read code lengths and codes from my header (you can also put few bytes in header to check correctness of file: for example XXY, so if file does not start with these bytes, its corrupted). After I read my code lengths, and my codes, it is time to decode rest of data. You can decode it in this way:
int data=0,dataLength=0;
while (input.read((char*)&sign, sizeof sign)) {
data = (data << 8) + sign;
dataLength += 8;
for (int i=0; i<256; i++) {
if (dataLengthFromHeader[i]==0)
continue;
if (dataLength>=dataLengthFromHeader[i] && codesFromHeader[i] == data >> (dataLength-dataLengthFromHeader[i])) {
unsigned char code = i;
izlaz.write((char*)&code, sizeof code);
dataLength -= dataLengthFromHeader[i];
data = data - (codesFromHeader[i] << dataLength);
if (dataLength==0) break;
i=0;
}
}
}
Upvotes: 0