angryInsomniac

Reputation: 859

Writing the Huffman tree to file after compression

I'm trying to write a Huffman tree to the compressed file after all the actual compressed data has been written. But I just realized there is a problem. Suppose I decide that once all my actual data has been written to the file, I will put in two linefeed characters and then write the tree. That means that when I read the file back, those two linefeeds (or any other characters, really) are my delimiter. The problem is that it's entirely possible for the actual data to also contain two consecutive linefeeds, and in that case my delimiter check would fail. I've used two linefeeds as the example here, but the same is true for any character string. I could work around the problem by using a longer string as the delimiter, but that would have two undesirable effects: 1. There is still a remote chance that the long string is, by coincidence, present in the compressed data. 2. It unnecessarily bloats a file that is supposed to be compressed.

Does anyone have any suggestions on how to separate the compressed data from the tree data?

Upvotes: 2

Views: 1207

Answers (2)

Tiger-222

Reputation: 7150

Why not write the size and the length in the first 8 bytes (4 bytes each), followed by the data? Reading it back would then look something like:

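#include <cstdint>
#include <fstream>

// Assumes `file` is a std::ifstream opened with std::ios::binary.
uint32_t compressed_size;
uint32_t data_len;
char* data;

// read() needs the address of each variable, not its value cast to a pointer.
file.read(reinterpret_cast<char*>(&compressed_size), 4);
file.read(reinterpret_cast<char*>(&data_len), 4);
data = new char[data_len];
// Read the payload from the same stream the header came from.
file.read(data, data_len);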

Should work. You could deflate the data for better compression.
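The writing side of that fixed-width header could be sketched as follows. This is only an illustration: `write_u32` and `read_u32` are hypothetical helper names, and the little-endian byte order is an assumption chosen so the file does not depend on the machine that wrote it.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: store a 32-bit value as 4 bytes, least significant first.
void write_u32(std::ostream& out, std::uint32_t value) {
    char bytes[4];
    for (int i = 0; i < 4; ++i) {
        bytes[i] = static_cast<char>((value >> (8 * i)) & 0xFF);
    }
    out.write(bytes, 4);
}

// Hypothetical helper: read back 4 bytes written by write_u32.
std::uint32_t read_u32(std::istream& in) {
    unsigned char bytes[4] = {0, 0, 0, 0};
    in.read(reinterpret_cast<char*>(bytes), 4);
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        value |= static_cast<std::uint32_t>(bytes[i]) << (8 * i);
    }
    return value;
}
```

Because each field always occupies exactly 4 bytes, the reader never has to search for a delimiter. With real files, both streams should be opened with `std::ios::binary` so that no newline translation alters the bytes.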

Upvotes: 0

user1071136

Reputation: 15725

First, write the size of the tree in bytes. Then write the tree itself, followed by the compressed contents.

When reading, first read the size, then the tree (now you know how many characters to read), and then the contents.

The size can be written as a string ending with a line feed. That way, you know that everything up to and including the first line feed belongs to the size of the tree.
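A minimal sketch of this layout, assuming the serialized tree and the compressed contents are already available as byte strings. The names `write_container` and `read_container` are made up for illustration and are not part of any library.

```cpp
#include <cstddef>
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: writes "<tree size>\n", then the tree bytes, then the contents.
void write_container(std::ostream& out, const std::string& tree,
                     const std::string& contents) {
    out << tree.size() << '\n';
    out.write(tree.data(), static_cast<std::streamsize>(tree.size()));
    out.write(contents.data(), static_cast<std::streamsize>(contents.size()));
}

// Hypothetical helper: reads the same layout back.
void read_container(std::istream& in, std::string& tree, std::string& contents) {
    std::string header;
    if (!std::getline(in, header)) {
        throw std::runtime_error("missing size header");
    }
    const std::size_t tree_size = std::stoull(header);

    // Exactly tree_size bytes belong to the tree, whatever they contain.
    tree.assign(tree_size, '\0');
    in.read(&tree[0], static_cast<std::streamsize>(tree_size));
    if (static_cast<std::size_t>(in.gcount()) != tree_size) {
        throw std::runtime_error("truncated tree");
    }

    // Everything after the tree is compressed contents.
    contents.assign(std::istreambuf_iterator<char>(in),
                    std::istreambuf_iterator<char>());
}
```

Only the first line feed in the file is treated specially, so line feeds inside the tree or the compressed contents cannot be mistaken for a delimiter. When working with files rather than string streams, open them with `std::ios::binary`.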

Upvotes: 3
