Gilbert Williams

Reputation: 1050

Copying GZipStream to file

I want to decompress a gzip-encoded file stored in S3 and write the result straight to a file stream.

Here is the code that wraps the S3 object stream in a GZipStream and passes it on:

using var stream = await _s3.GetObjectStreamAsync(_processServiceOptions.BucketName, key, null);
using var gzipStream = new GZipStream(stream, CompressionMode.Decompress, true);
await WriteToFileAsync(gzipStream);

I then copy it directly to the file stream, rather than buffering it in memory through another stream:

async Task WriteToFileAsync(Stream data)
{
    using (var fs = File.OpenWrite(path))
    {
        await data.CopyToAsync(fs);
    }
}

However, I'm getting this exception: System.IO.InvalidDataException: The archive entry was compressed using an unsupported compression method.

Why is that?

Upvotes: 0

Views: 452

Answers (1)

CaseyHofland

Reputation: 538

For anyone still looking for an answer (I know I was), I opted for a different solution. I decided to write a generic compressor that can be used anywhere you need to compress a byte[]. It does a little more work than necessary, because the data passes through an in-memory stream both on the way in and on the way out; optimizing that is left as an exercise for the reader. Still, a generic compressor is an elegant solution overall.

Here's one for GZip:

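using System.IO;
using System.IO.Compression;
using System.Threading;
using System.Threading.Tasks;

public class GZipCompressor : ICompressor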
{
    public async Task<byte[]> CompressAsync(byte[] data, CancellationToken cancellationToken = default)
    {
        using var uncompressedStream = new MemoryStream(data);
        using var compressedStream = new MemoryStream();
        using (var compressorStream = new GZipStream(compressedStream, CompressionMode.Compress))
        {
            await uncompressedStream.CopyToAsync(compressorStream, cancellationToken);
        }

        return compressedStream.ToArray();
    }

    public async Task<byte[]> DecompressAsync(byte[] compressedData, CancellationToken cancellationToken = default)
    {
        using var compressedStream = new MemoryStream(compressedData);
        using var decompressorStream = new GZipStream(compressedStream, CompressionMode.Decompress);
        using var decompressedStream = new MemoryStream();
        await decompressorStream.CopyToAsync(decompressedStream, cancellationToken);

        return decompressedStream.ToArray();
    }
}

And here's one for Deflate:

public class DeflateCompressor : ICompressor
{
    public async Task<byte[]> CompressAsync(byte[] data, CancellationToken cancellationToken = default)
    {
        using var uncompressedStream = new MemoryStream(data);
        using var compressedStream = new MemoryStream();
        using (var compressorStream = new DeflateStream(compressedStream, CompressionMode.Compress))
        {
            await uncompressedStream.CopyToAsync(compressorStream, cancellationToken);
        }

        return compressedStream.ToArray();
    }

    public async Task<byte[]> DecompressAsync(byte[] compressedData, CancellationToken cancellationToken = default)
    {
        using var compressedStream = new MemoryStream(compressedData);
        using var decompressorStream = new DeflateStream(compressedStream, CompressionMode.Decompress);
        using var decompressedStream = new MemoryStream();
        await decompressorStream.CopyToAsync(decompressedStream, cancellationToken);

        return decompressedStream.ToArray();
    }
}
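
The ICompressor interface that both classes implement is not shown here. A minimal sketch, assuming it declares nothing beyond the two methods the classes define (the actual interface may look different):

```csharp
using System.Threading;
using System.Threading.Tasks;

// Assumed shape of ICompressor, inferred from the GZip and Deflate implementations.
public interface ICompressor
{
    // Returns the compressed form of data.
    Task<byte[]> CompressAsync(byte[] data, CancellationToken cancellationToken = default);

    // Returns the original bytes for data produced by CompressAsync.
    Task<byte[]> DecompressAsync(byte[] compressedData, CancellationToken cancellationToken = default);
}
```

Coding against the interface lets the calling code swap GZip for Deflate without further changes.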

Now you just call this before sending your data to AWS and you're golden.
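
If the byte[] round trip is too costly for large objects, the decompressor can also be pointed straight at a file. The following is only a sketch of that idea, not a tested replacement for the classes above: DecompressToFileAsync is a hypothetical helper name, and a MemoryStream stands in for the S3 object stream so the example runs on its own.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for the S3 object stream: a small gzip payload built in memory.
var original = Encoding.UTF8.GetBytes("hello from a gzip stream");
using var source = new MemoryStream();
using (var gzip = new GZipStream(source, CompressionMode.Compress, leaveOpen: true))
{
    gzip.Write(original, 0, original.Length);
}
source.Position = 0; // rewind so the decompressor starts at the gzip header

var path = Path.GetTempFileName();
await DecompressToFileAsync(source, path);
Console.WriteLine(File.ReadAllText(path));

// Hypothetical helper: streams decompressed bytes to disk without an
// intermediate byte[] or MemoryStream holding the whole payload.
static async Task DecompressToFileAsync(
    Stream compressed, string path, CancellationToken cancellationToken = default)
{
    // leaveOpen: true leaves disposal of the source stream to the caller.
    using var decompressor = new GZipStream(compressed, CompressionMode.Decompress, leaveOpen: true);

    // File.Create truncates an existing file; File.OpenWrite does not, so a
    // longer pre-existing file would keep its old trailing bytes.
    using var file = File.Create(path);
    await decompressor.CopyToAsync(file, cancellationToken);
}
```

This only works when the source really is gzip data. GZipStream throws InvalidDataException when the bytes it reads are not a valid gzip stream, so it is worth confirming that the object was stored compressed and has not already been decoded before it reaches this code.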

Upvotes: 0
