Reputation: 13896
I had the idea of merging an arbitrary number of small text files into a single compressed file using the GZipStream class. I spent several nights getting it to work, but the resulting file ends up being larger than the text files would be if they were simply concatenated. I only vaguely understand how Huffman coding works, so I don't know whether this approach is practical or whether there is a better alternative. Ultimately, I want an external sorted index file that maps out each blob for fast access. What do you think?
using System.IO;
using System.IO.Compression;
using System.Text;

// keep track of the current byte offset within the merged file
long indexByteOffset = 0;
// in reality the blobs vary in size from 1k to 300k bytes
string[] originalData = { "data blob1", "data blob2", "data blob3", "data blob4" /* etc etc etc */ };
// merged compressed file
using (BinaryWriter zipWriter = new BinaryWriter(File.Create(@"c:\temp\merged.gz")))
// index keeps track of the beginning and end position of each blob
using (StreamWriter indexWriter = new StreamWriter(File.Create(@"c:\temp\index.txt")))
{
    foreach (var blob in originalData)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            // compress each blob independently; disposing the GZipStream flushes it
            using (GZipStream zipper = new GZipStream(ms, CompressionMode.Compress))
            {
                byte[] encodeBuffer = Encoding.UTF8.GetBytes(blob);
                zipper.Write(encodeBuffer, 0, encodeBuffer.Length);
            }
            byte[] compressedData = ms.ToArray();
            zipWriter.Write(compressedData);
            // "\t" must be a string here; adding the char '\t' to a long
            // would add 9 to the offset instead of inserting a tab
            indexWriter.WriteLine(indexByteOffset + "\t" + (indexByteOffset + compressedData.Length));
            indexByteOffset += compressedData.Length;
        }
    }
}
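Reading a blob back via the index would look roughly like this (a sketch only, assuming the tab-separated start/end offsets written to index.txt above):

// Sketch: read one blob back using a start/end offset pair from index.txt.
// Each blob was gzipped independently, so it can be decompressed on its own.
string ReadBlob(string mergedPath, long start, long end)
{
    using (FileStream fs = File.OpenRead(mergedPath))
    using (BinaryReader reader = new BinaryReader(fs))
    {
        fs.Seek(start, SeekOrigin.Begin);
        byte[] compressed = reader.ReadBytes((int)(end - start));
        using (MemoryStream src = new MemoryStream(compressed))
        using (GZipStream zipper = new GZipStream(src, CompressionMode.Decompress))
        using (MemoryStream dst = new MemoryStream())
        {
            zipper.CopyTo(dst);
            return Encoding.UTF8.GetString(dst.ToArray());
        }
    }
}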
Upvotes: 0
Views: 227
Reputation: 1062600
Different data compresses with different effectiveness, and small data usually isn't worth trying to compress. One common approach is to allow for an "is it compressed?" flag: do a speculative compress, but if the result is larger, store the original. That information could be included in the index.

Personally, though, I'd probably be tempted to go for a single file - either a .zip, or one that includes the length of each fragment as a 4-byte prefix (or maybe a "varint") before each fragment. Seeking to the n-th fragment is then just a case of "read length prefix, decode as int, seek that many bytes, repeat". You could also reserve one bit of that prefix for "is it compressed".

But as for whether it is worth compressing at all: that depends on your data.
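A minimal sketch of that single-file, length-prefixed layout. The exact format here is just an illustrative assumption: a 4-byte length written by BinaryWriter, with the sign bit doing duty as the "is it compressed" flag (so it assumes non-empty fragments).

using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

static class FragmentFile
{
    // Writes each blob as: [int32 length][payload]; a negative length marks
    // "payload is gzip-compressed". Layout is an assumption, not a standard.
    public static void Write(string path, IEnumerable<string> blobs)
    {
        using (var writer = new BinaryWriter(File.Create(path)))
        {
            foreach (string blob in blobs)
            {
                byte[] raw = Encoding.UTF8.GetBytes(blob);
                byte[] compressed;
                using (var ms = new MemoryStream())
                {
                    using (var gz = new GZipStream(ms, CompressionMode.Compress))
                        gz.Write(raw, 0, raw.Length);
                    compressed = ms.ToArray();
                }

                // speculative compress: keep the original if gzip made it bigger
                bool useCompressed = compressed.Length < raw.Length;
                byte[] payload = useCompressed ? compressed : raw;

                int lengthWithFlag = useCompressed ? -payload.Length : payload.Length;
                writer.Write(lengthWithFlag);   // 4-byte prefix, sign = "is compressed"
                writer.Write(payload);
            }
        }
    }

    // Reads the n-th fragment: read length prefix, skip that many bytes, repeat.
    public static string ReadFragment(string path, int index)
    {
        using (var reader = new BinaryReader(File.OpenRead(path)))
        {
            for (int i = 0; ; i++)
            {
                int lengthWithFlag = reader.ReadInt32();
                bool isCompressed = lengthWithFlag < 0;
                int length = Math.Abs(lengthWithFlag);

                if (i < index)
                {
                    // not the fragment we want: skip over its payload
                    reader.BaseStream.Seek(length, SeekOrigin.Current);
                    continue;
                }

                byte[] payload = reader.ReadBytes(length);
                if (!isCompressed)
                    return Encoding.UTF8.GetString(payload);

                using (var src = new MemoryStream(payload))
                using (var gz = new GZipStream(src, CompressionMode.Decompress))
                using (var dst = new MemoryStream())
                {
                    gz.CopyTo(dst);
                    return Encoding.UTF8.GetString(dst.ToArray());
                }
            }
        }
    }
}

Using the sign bit keeps the prefix at a fixed 4 bytes; a varint would save a little space for small fragments at the cost of slightly more parsing code.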
Upvotes: 1