Reputation: 1579
I am attempting to compress a large amount of data, sometimes in the region of 100 GB. When I run the routine I have written, the output file appears to come out exactly the same size as the original. Has anyone else had this issue with GZipStream?
My code is as follows:
byte[] buffer = BitConverter.GetBytes(StreamSize);
FileStream LocalUnCompressedFS = File.OpenWrite(ldiFileName);
LocalUnCompressedFS.Write(buffer, 0, buffer.Length);
GZipStream LocalFS = new GZipStream(LocalUnCompressedFS, CompressionMode.Compress);
buffer = new byte[WriteBlock];
UInt64 WrittenBytes = 0;
while (WrittenBytes + WriteBlock < StreamSize)
{
    fromStream.Read(buffer, 0, (int)WriteBlock);
    LocalFS.Write(buffer, 0, (int)WriteBlock);
    WrittenBytes += WriteBlock;
    OnLDIFileProgress(WrittenBytes, StreamSize);
    if (Cancel)
        break;
}
if (!Cancel)
{
    double bytesleft = StreamSize - WrittenBytes;
    fromStream.Read(buffer, 0, (int)bytesleft);
    LocalFS.Write(buffer, 0, (int)bytesleft);
    WrittenBytes += (uint)bytesleft;
    OnLDIFileProgress(WrittenBytes, StreamSize);
}
LocalFS.Close();
fromStream.Close();
StreamSize is an 8-byte UInt64 value that holds the size of the file. I write these 8 bytes raw to the start of the file so that I know the original file size. WriteBlock has the value 32 KB (32768 bytes). fromStream is the stream to take data from, in this instance a FileStream. Are the 8 bytes in front of the compressed data going to cause an issue?
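For reference, the read-back side of this header scheme would look roughly like the sketch below; only ldiFileName comes from the code above, the other names and the ".restored" output path are assumptions for illustration:
// Sketch: recover the 8-byte length prefix, then decompress the rest.
using (FileStream compressedFS = File.OpenRead(ldiFileName))
{
    byte[] header = new byte[8];
    compressedFS.Read(header, 0, header.Length);             // assumes the prefix is present
    UInt64 originalSize = BitConverter.ToUInt64(header, 0);  // usable for progress reporting

    // The GZip data starts immediately after the prefix, so the remainder of the
    // stream can be handed straight to GZipStream for decompression.
    using (GZipStream unzip = new GZipStream(compressedFS, CompressionMode.Decompress))
    using (FileStream restored = File.Create(ldiFileName + ".restored"))
    {
        unzip.CopyTo(restored);
    }
}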
Upvotes: 6
Views: 9916
Reputation: 82316
Austin Salonen's code doesn't work for me (buggy, 4GB error).
Here's the proper way:
using System;
using System.Collections.Generic;
using System.Text;

namespace CompressFile
{

    class Program
    {

        static void Main(string[] args)
        {
            string FileToCompress = @"D:\Program Files (x86)\msvc\wkhtmltopdf64\bin\wkhtmltox64.dll";
            FileToCompress = @"D:\Program Files (x86)\msvc\wkhtmltopdf32\bin\wkhtmltox32.dll";

            string CompressedFile = System.IO.Path.Combine(
                 System.IO.Path.GetDirectoryName(FileToCompress)
                ,System.IO.Path.GetFileName(FileToCompress) + ".gz"
            );

            CompressFile(FileToCompress, CompressedFile);
            // CompressFile_AllInOne(FileToCompress, CompressedFile);

            Console.WriteLine(Environment.NewLine);
            Console.WriteLine(" --- Press any key to continue --- ");
            Console.ReadKey();
        } // End Sub Main


        public static void CompressFile(string FileToCompress, string CompressedFile)
        {
            //byte[] buffer = new byte[1024 * 1024 * 64];
            byte[] buffer = new byte[1024 * 1024]; // 1MB

            using (System.IO.FileStream sourceFile = System.IO.File.OpenRead(FileToCompress))
            {

                using (System.IO.FileStream destinationFile = System.IO.File.Create(CompressedFile))
                {

                    using (System.IO.Compression.GZipStream output = new System.IO.Compression.GZipStream(destinationFile,
                        System.IO.Compression.CompressionMode.Compress))
                    {
                        // long rather than int, so the running total cannot overflow on files > 2 GB
                        long bytesRead = 0;

                        while (bytesRead < sourceFile.Length)
                        {
                            int ReadLength = sourceFile.Read(buffer, 0, buffer.Length);
                            output.Write(buffer, 0, ReadLength);
                            output.Flush();
                            bytesRead += ReadLength;
                        } // Whend

                        destinationFile.Flush();
                    } // End Using System.IO.Compression.GZipStream output

                    destinationFile.Close();
                } // End Using System.IO.FileStream destinationFile

                // Close the files.
                sourceFile.Close();
            } // End Using System.IO.FileStream sourceFile

        } // End Sub CompressFile


        public static void CompressFile_AllInOne(string FileToCompress, string CompressedFile)
        {

            using (System.IO.FileStream sourceFile = System.IO.File.OpenRead(FileToCompress))
            {

                using (System.IO.FileStream destinationFile = System.IO.File.Create(CompressedFile))
                {
                    // Reads the whole file into memory, so only suitable for small files.
                    byte[] buffer = new byte[sourceFile.Length];
                    sourceFile.Read(buffer, 0, buffer.Length);

                    using (System.IO.Compression.GZipStream output = new System.IO.Compression.GZipStream(destinationFile,
                        System.IO.Compression.CompressionMode.Compress))
                    {
                        output.Write(buffer, 0, buffer.Length);
                        output.Flush();
                        destinationFile.Flush();
                    } // End Using System.IO.Compression.GZipStream output

                    // Close the files.
                    destinationFile.Close();
                } // End Using System.IO.FileStream destinationFile

                sourceFile.Close();
            } // End Using System.IO.FileStream sourceFile

        } // End Sub CompressFile_AllInOne


    } // End Class Program

} // End Namespace CompressFile
Upvotes: 1
Reputation: 50235
I ran a test using the following code for compression and it ran without issue on a 7GB and 12GB file (both known beforehand to compress "well"). Does this version work for you?
const string toCompress = @"input.file";
var buffer = new byte[1024*1024*64];
using(var compressing = new GZipStream(File.OpenWrite(@"output.gz"), CompressionMode.Compress))
using(var file = File.OpenRead(toCompress))
{
    var bytesRead = 0;
    while(bytesRead < buffer.Length)
    {
        bytesRead = file.Read(buffer, 0, buffer.Length);
        compressing.Write(buffer, 0, buffer.Length);
    }
}
Have you checked out the documentation?
The GZipStream class cannot decompress data that results in over 8 GB of uncompressed data.
You probably need to find a different library that will support your needs, or attempt to break your data up into <= 8 GB chunks that can safely be "sewn" back together.
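If you do go the chunking route, the split could look something like the rough sketch below (untested; ChunkCompress, the chunkSize parameter and the ".N.gz" naming are all made up for illustration, and it needs using System, System.IO and System.IO.Compression). Decompression is the reverse: gunzip each piece in order and append the output to a single file.
// Split the input into fixed-size pieces and gzip each piece to its own numbered file.
static void ChunkCompress(string inputPath, long chunkSize)
{
    byte[] buffer = new byte[1024 * 1024];
    using (FileStream source = File.OpenRead(inputPath))
    {
        int chunkIndex = 0;
        while (source.Position < source.Length)
        {
            // e.g. bigfile.dat.0.gz, bigfile.dat.1.gz, ...
            string chunkPath = inputPath + "." + chunkIndex + ".gz";
            using (GZipStream gz = new GZipStream(File.Create(chunkPath), CompressionMode.Compress))
            {
                long writtenInChunk = 0;
                while (writtenInChunk < chunkSize)
                {
                    int toRead = (int)Math.Min(buffer.Length, chunkSize - writtenInChunk);
                    int read = source.Read(buffer, 0, toRead);
                    if (read == 0)
                        break; // end of input
                    gz.Write(buffer, 0, read);
                    writtenInChunk += read;
                }
            }
            chunkIndex++;
        }
    }
}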
Upvotes: 5