Matthew1471

Reputation: 248

Quickest Way To Decompress BIG .tar.gz In C#?

I have a load of .tar.gz files that are around 5GB. I have noticed that the .NET GZipStream actually gets stuck in an infinite loop trying to decompress them.

I found some pure C# libraries, but they all had issues with the size of my files. Unlike other posters (24GB tar.gz Decompress using sharpziplib), I am compiling the application as a 64-bit .NET 4.5.1 application on an x64 machine.

I noticed that .NET 4.5.1 is said to remove the 2GB limit, but after reading up on it I found that to be quite misleading. It appears that it only lifts the 2GB cap on the total size of an object, including its nested parts; the actual addressable range for objects such as byte arrays still appears to be 2GB, even with the relevant option turned on.

Does anyone have any solutions, or have I hit a limitation in C#? I can invoke the 64-bit 7-Zip DLL from my app, or call the 7-Zip .exe and wait for it to finish (a bit of a bodge), but there has to be a cleaner way. I also want the quickest decompression, preferably in pure C# code, but I'm currently left thinking this is not possible in C# (due to the limit on the addressable range of byte arrays).

Upvotes: 1

Views: 1950

Answers (1)

Reed Copsey

Reputation: 564771

You won't be able to load the resulting data into a single byte[] in C#. You will still be limited by the array size.

However, you should be able to decompress these without issue by streaming: read the compressed file through a decompressing stream and write the output as it arrives, so the full result never has to sit in memory. I've had very good luck with DotNetZip and large streams. Using it, you should be able to just do:

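// Stream the .gz straight to disk; the output here is the decompressed .tar.
using (System.IO.Stream input = System.IO.File.OpenRead(inputFile))
using (System.IO.Stream decompressor = new Ionic.Zlib.GZipStream(input, Ionic.Zlib.CompressionMode.Decompress, true))
using (System.IO.Stream output = System.IO.File.Create(outputFile))
    decompressor.CopyTo(output); // copies in small chunks, never one giant byte[]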
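
To illustrate why streaming sidesteps the array size limit, here is a rough sketch of the kind of loop a stream copy boils down to. The StreamCopy class and CopyInChunks method are hypothetical names made up for this example (they are not part of DotNetZip or the framework), and the 81920-byte default buffer is just an assumed size. The point is that only the small buffer is ever allocated, while the running total is a long and so can go well past 2GB.

```csharp
using System;
using System.IO;

static class StreamCopy
{
    // Hypothetical helper (illustration only): copies everything from
    // source to destination through one small, reusable buffer.
    public static long CopyInChunks(Stream source, Stream destination, int bufferSize = 81920)
    {
        var buffer = new byte[bufferSize]; // the only array allocated
        long total = 0;                    // long, so totals above 2GB are fine
        int read;

        // Read returns 0 once the end of the stream has been reached.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }

        return total; // number of bytes copied
    }
}
```

Assuming the decompressor and output streams from the snippet above, StreamCopy.CopyInChunks(decompressor, output) could stand in for decompressor.CopyTo(output), for instance if you wanted the byte count for progress reporting.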

Upvotes: 5
