David Nagl

Reputation: 79

C# GZipStream compressing data but decompression returns empty stream

I have the following code:

    public static async Task<string> Compress(string inputString)
    {
        var bytes = Encoding.Unicode.GetBytes(inputString);
        await using var input = new MemoryStream(bytes);
        await using var output = new MemoryStream();
        await using var stream = new GZipStream(output, CompressionLevel.SmallestSize);

        await input.CopyToAsync(stream);

        return Convert.ToBase64String(output.ToArray());
    }

    public static async Task<string> Decompress(string inputString)
    {
        var bytes = Convert.FromBase64String(inputString);

        await using var output = new MemoryStream();
        await using var input = new MemoryStream(bytes);
        await using var stream = new GZipStream(input, CompressionMode.Decompress);

        await stream.CopyToAsync(output);
        await stream.FlushAsync();
        
        return Encoding.Unicode.GetString(output.ToArray());
    }

When I try to compress the string 'Hello World', the compressed Base64 encoded string is 'H4sIAAAAAAACCg=='

When I try to decompress the Base64 encoded string 'H4sIAAAAAAACCg==' the method Decompress returns an empty string.

Upvotes: 1

Views: 609

Answers (2)

Gerald Mayr

Reputation: 729

GZipStream (or rather its internal DeflateStream) is designed to flush its internal buffer implicitly when Dispose is called (see https://github.com/dotnet/runtime/blob/main/src/libraries/System.IO.Compression/src/System/IO/Compression/DeflateZLib/DeflateStream.cs).

For using statements without curly braces (using declarations, introduced in C# 8.0), the lifetime of the object is bounded by the enclosing scope. In this case the enclosing scope is the method itself, so the object is disposed only when the method exits. As a result, the implicit flush from stream to output happens after Convert.ToBase64String(output.ToArray()) has already been evaluated.
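The ordering can be observed in isolation with a small sketch. Tracker and ScopeDemo are hypothetical names made up for this illustration and have nothing to do with GZipStream itself; the point is only to show when Dispose runs relative to the return expression.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that records the moment it is disposed.
public sealed class Tracker : IDisposable
{
    private readonly List<string> _log;
    public Tracker(List<string> log) => _log = log;
    public void Dispose() => _log.Add("disposed");
}

public static class ScopeDemo
{
    // Using declaration: Dispose runs when the method exits,
    // i.e. after the return expression has been evaluated.
    public static int WithDeclaration(List<string> log)
    {
        using var tracker = new Tracker(log);
        return log.Count; // the log is still empty here
    }

    // Using statement with braces: Dispose runs at the closing brace,
    // before the return expression is evaluated.
    public static int WithBlock(List<string> log)
    {
        using (var tracker = new Tracker(log))
        {
        }
        return log.Count; // the log already contains "disposed"
    }
}
```

WithDeclaration sees an empty log when it computes its return value, while WithBlock sees the entry written by Dispose. The same thing happens with output.ToArray() in the question: it runs while the GZipStream is still undisposed.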

To avoid this, we can either add curly braces to limit the scope (see the example below) or flush the stream explicitly (this does not work in all .NET versions, see https://github.com/dotnet/runtime/commit/728aa671567d498c1acb6e13cb5cf4f7a883acf7).

    public static async Task<string> Compress(string inputString)
    {
        var bytes = Encoding.Unicode.GetBytes(inputString);

        await using (var input = new MemoryStream(bytes))
        {
            await using (var output = new MemoryStream())
            {
                await using (var stream = new GZipStream(output, CompressionLevel.SmallestSize))
                {
                    await input.CopyToAsync(stream);
                }

                return Convert.ToBase64String(output.ToArray());
            }
        }
    }

    public static async Task<string> Decompress(string inputString)
    {
        var bytes = Convert.FromBase64String(inputString);

        await using (var output = new MemoryStream())
        {
            await using (var input = new MemoryStream(bytes))
            {
                await using (var stream = new GZipStream(input, CompressionMode.Decompress))
                {
                    await stream.CopyToAsync(output);
                }
            }

            return Encoding.Unicode.GetString(output.ToArray());
        }
    }
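If the nesting feels heavy, a more compact variant is possible, since only the GZipStream on the compression side needs the narrow scope. The following is a sketch, not the only way to write it: the class name GzipText is made up for this example, and it relies on MemoryStream.ToArray still working after the stream has been closed.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Threading.Tasks;

public static class GzipText
{
    public static async Task<string> Compress(string inputString)
    {
        var bytes = Encoding.Unicode.GetBytes(inputString);
        using var output = new MemoryStream();

        // Disposing the GZipStream at the closing brace writes the remaining
        // compressed data into output before ToArray is called.
        await using (var gzip = new GZipStream(output, CompressionLevel.SmallestSize))
        {
            await gzip.WriteAsync(bytes, 0, bytes.Length);
        }

        // ToArray can still be called although gzip has closed output.
        return Convert.ToBase64String(output.ToArray());
    }

    public static async Task<string> Decompress(string inputString)
    {
        var bytes = Convert.FromBase64String(inputString);
        using var input = new MemoryStream(bytes);
        using var gzip = new GZipStream(input, CompressionMode.Decompress);
        using var output = new MemoryStream();

        // Reading needs no special scoping: CopyToAsync runs until the
        // decompressed data is exhausted.
        await gzip.CopyToAsync(output);
        return Encoding.Unicode.GetString(output.ToArray());
    }
}
```

With this version, await GzipText.Decompress(await GzipText.Compress("Hello World")) should give back the original string, and the Base64 text is longer than the header-only 'H4sIAAAAAAACCg==' from the question.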

Upvotes: 4

Sohaib Jundi

Reputation: 1664

The compressed string you are getting is incomplete. output.ToArray() is called before the GZipStream (stream) has been flushed, so the output MemoryStream does not yet contain all of the compressed bytes. You need to add await stream.FlushAsync(); after await input.CopyToAsync(stream);.
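One caveat worth checking on your own runtime: a flush pushes out buffered data, but the gzip trailer (CRC-32 and length) is only written when the stream is finished, which happens on disposal. The sketch below compares the number of bytes in the output after FlushAsync with the number after the GZipStream has been disposed. FlushCheck and Measure are hypothetical names for this illustration, and leaveOpen: true is used only so that output.Length can be read afterwards.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Threading.Tasks;

public static class FlushCheck
{
    // Returns the byte count of the output after FlushAsync and
    // after the GZipStream has been disposed.
    public static async Task<(int AfterFlush, int AfterDispose)> Measure(string text)
    {
        var bytes = Encoding.Unicode.GetBytes(text);
        using var output = new MemoryStream();
        int afterFlush;

        // leaveOpen: true keeps output usable once gzip is disposed.
        await using (var gzip = new GZipStream(output, CompressionLevel.Optimal, leaveOpen: true))
        {
            await gzip.WriteAsync(bytes, 0, bytes.Length);
            await gzip.FlushAsync();
            afterFlush = (int)output.Length;
        }

        // Disposal has finished the gzip stream, so the trailer is included now.
        return (afterFlush, (int)output.Length);
    }
}
```

AfterDispose comes out larger than AfterFlush, so if the result has to be a complete gzip stream (for example because another tool will read it), disposing the GZipStream before calling output.ToArray() is the safer choice.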

Upvotes: 4
