Reputation: 3677
I have to decompress some gzip text in a .NET 6 app; however, on a string that is 20,627 characters long, it only decompresses about 1/3 of it. The code I am using works for this string in .NET 5 or .NET Core 3.1, as well as for smaller compressed strings.
public static string Decompress(this string compressedText)
{
    var gZipBuffer = Convert.FromBase64String(compressedText);
    using var memoryStream = new MemoryStream();
    int dataLength = BitConverter.ToInt32(gZipBuffer, 0);
    memoryStream.Write(gZipBuffer, 4, gZipBuffer.Length - 4);
    var buffer = new byte[dataLength];
    memoryStream.Position = 0;
    using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Decompress))
    {
        gZipStream.Read(buffer, 0, buffer.Length);
    }
    return Encoding.UTF8.GetString(buffer);
}
The results look something like this:
Start of amazing text..... ...Text is fine till 33,619 after that is allNULLNULLNULLNULL
The rest of the file after the first 33,618 characters is just nulls.
I have no idea why this is happening.
Edit: I updated this when I found the issue was not Blazor but in fact .NET 6. I took a project that was working in .NET Core 3.1, changed nothing other than compiling for .NET 6, and got the same error. The update reflects this.
Edit 2: Just tested and it works in .NET 5, so it is only in .NET 6 that this error happens.
Upvotes: 17
Views: 4040
Reputation: 48240
Just confirmed that the article linked in the comments below the question contains a valid clue about the issue.
The corrected code would be:
string Decompress(string compressedText)
{
    var gZipBuffer = Convert.FromBase64String(compressedText);
    using var memoryStream = new MemoryStream();
    int dataLength = BitConverter.ToInt32(gZipBuffer, 0);
    memoryStream.Write(gZipBuffer, 4, gZipBuffer.Length - 4);
    var buffer = new byte[dataLength];
    memoryStream.Position = 0;
    using var gZipStream = new GZipStream(memoryStream, CompressionMode.Decompress);
    // Read may return fewer bytes than requested, so keep reading
    // until the buffer is full or the stream is exhausted.
    int totalRead = 0;
    while (totalRead < buffer.Length)
    {
        int bytesRead = gZipStream.Read(buffer, totalRead, buffer.Length - totalRead);
        if (bytesRead == 0) break;
        totalRead += bytesRead;
    }
    return Encoding.UTF8.GetString(buffer);
}
This approach changes

    gZipStream.Read(buffer, 0, buffer.Length);

to

    int totalRead = 0;
    while (totalRead < buffer.Length)
    {
        int bytesRead = gZipStream.Read(buffer, totalRead, buffer.Length - totalRead);
        if (bytesRead == 0) break;
        totalRead += bytesRead;
    }

which correctly takes Read's return value into account. Read has never guaranteed that it fills the buffer in one call, but starting with .NET 6, GZipStream (and DeflateStream) may return as soon as any decompressed data is available rather than blocking until the requested count is reached, so a single Read now routinely comes back short.
Without the change, the issue is easily reproducible on any string random enough to produce a gzip payload larger than roughly 10 KB.
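As an aside, a variant that sidesteps the partial-read pitfall entirely (a sketch, not part of the original answer) is to let CopyTo drain the stream and ignore the 4-byte length prefix instead of trusting it:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static string DecompressViaCopyTo(string compressedText)
{
    var gZipBuffer = Convert.FromBase64String(compressedText);
    // Skip the 4-byte length prefix written by the compressor.
    using var input = new MemoryStream(gZipBuffer, 4, gZipBuffer.Length - 4);
    using var gZipStream = new GZipStream(input, CompressionMode.Decompress);
    using var output = new MemoryStream();
    gZipStream.CopyTo(output); // CopyTo loops internally until end of stream
    return Encoding.UTF8.GetString(output.ToArray());
}
```

This also works when the prefix is wrong or missing, at the cost of an extra buffer copy.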
Here's the compressor, if anyone's interested in testing this on their own:
string Compress(string plainText)
{
    var buffer = Encoding.UTF8.GetBytes(plainText);
    using var memoryStream = new MemoryStream();
    var lengthBytes = BitConverter.GetBytes(buffer.Length);
    memoryStream.Write(lengthBytes, 0, lengthBytes.Length);
    // Dispose the GZipStream before reading the MemoryStream back:
    // the gzip footer is only written on Dispose, not on Flush.
    // MemoryStream.ToArray still works after the stream is closed.
    using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Compress))
    {
        gZipStream.Write(buffer, 0, buffer.Length);
    }
    var gZipBuffer = memoryStream.ToArray();
    return Convert.ToBase64String(gZipBuffer);
}
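For completeness, here is a quick round-trip check (a sketch; it assumes the Compress and Decompress methods above are in scope) using a payload random enough to trigger the partial-read behavior:

```csharp
using System;
using System.Text;

// Build a string random enough that its gzip payload exceeds ~10 KB.
var sb = new StringBuilder();
for (int i = 0; i < 2000; i++) sb.Append(Guid.NewGuid());
var original = sb.ToString();

var compressed = Compress(original);
var restored = Decompress(compressed);

// With the naive single Read, the tail of 'restored' would be NUL
// characters on .NET 6; with the read loop the round trip is lossless.
Console.WriteLine(restored == original);
```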
Upvotes: 26