MarkusParker

Reputation: 1596

Avoid copying compressed data when using DeflateStream

Assume we are given an API function f(Stream s) that stores the binary data contained in a stream in a database. I want to put a file into the database using f, but I want to compress the data first. So I thought I could do the following:

var fileStream= File.OpenRead(path);
using(var dstream = new DeflateStream(fileStream, CompressionLevel.Optimal))
   f(dstream);

But it seems that DeflateStream, when compressing, only writes into the underlying stream (fileStream here) and does not read from it. All the examples I found use the CopyTo method to compress or decompress. That would mean I have to keep a copy of the compressed data in memory before passing it to f, for instance like this:

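var memoryStream = new MemoryStream();
using(var fileStream = File.OpenRead(path))
{
  // leaveOpen: true keeps memoryStream usable after dstream is disposed
  using(var dstream = new DeflateStream(memoryStream, CompressionLevel.Optimal, leaveOpen: true))
    fileStream.CopyTo(dstream); // disposing dstream flushes the remaining compressed data
  memoryStream.Seek(0, SeekOrigin.Begin);
  f(memoryStream);
}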

Is there any way to avoid using the MemoryStream?

Update: Since some commenters insisted, here is a complete example:

using System;
using System.IO;
using System.IO.Compression;

public class ThisWouldBeTheDatabaseClient {
  public void f(Stream s) {
    // some implementation I don't have access to
    // The only thing I know is that it reads data from the stream in some way.
    var buffer = new byte[10];
    s.Read(buffer,0,10);
  }
}

public class Program {
  public static void Main() {
    var dummyDatabaseClient = new ThisWouldBeTheDatabaseClient();
    var dataBuffer = new byte[1000];
    var fileStream= new MemoryStream( dataBuffer ); // would be "File.OpenRead(path)" in real case
    using(var dstream = new DeflateStream(fileStream, CompressionLevel.Optimal))
        dummyDatabaseClient.f(dstream);
  }
}

The read operation in the dummy implementation of f throws an exception: InvalidOperationException: Reading from the compression stream is not supported. To sum up the discussion in the comments, I assume that the desired behaviour is not possible with System.IO.Compression.DeflateStream, but there are alternatives in third-party libraries.
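The exception can be anticipated without calling Read at all. The following is a minimal sketch, assuming the built-in System.IO.Compression.DeflateStream; the class and method names are made up for illustration. A stream opened for compression should report that it is write-only:

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, for illustration only.
public static class DeflateDirectionCheck
{
    // Returns the CanRead / CanWrite flags of a built-in DeflateStream
    // that was opened for compression over a writable in-memory stream.
    public static (bool CanRead, bool CanWrite) CompressingCapabilities()
    {
        using (var inner = new MemoryStream())
        using (var dstream = new DeflateStream(inner, CompressionLevel.Optimal))
        {
            return (dstream.CanRead, dstream.CanWrite);
        }
    }

    public static void Main()
    {
        var caps = CompressingCapabilities();
        Console.WriteLine("CanRead: " + caps.CanRead + ", CanWrite: " + caps.CanWrite);
    }
}
```

So a consumer such as f, which pulls data with Read, cannot be handed this stream directly; the compressed bytes only ever flow into the wrapped stream.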

Upvotes: 0

Views: 1946

Answers (2)

Sir Rufo

Reputation: 19106

The DeflateStream is just a wrapper and needs a stream for the compressed data. So you have to use two streams.

Is there any way to avoid using the MemoryStream?

Yes.

You need a stream to store the temporary data without consuming (too much) memory. Instead of using a MemoryStream, you can use a temporary file for that.

For lazy people (like me in the first place), let's create a class that behaves mostly like a MemoryStream:

public class TempFileStream : FileStream
{
    public TempFileStream() : base(
        path: Path.Combine(Path.GetTempPath(), Path.GetRandomFileName()),
        mode: FileMode.OpenOrCreate,
        access: FileAccess.ReadWrite,
        share: FileShare.None,
        bufferSize: 4096,
        options: FileOptions.DeleteOnClose | FileOptions.Asynchronous | FileOptions.Encrypted | FileOptions.RandomAccess)
    {
    }
}

The important part here is FileOptions.DeleteOnClose which will remove the temporary file when you dispose the stream.
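To see that behaviour in isolation, here is a small sketch that uses a plain FileStream with only FileOptions.DeleteOnClose set. The class name DeleteOnCloseDemo and its Run method are hypothetical and not part of the answer's code:

```csharp
using System;
using System.IO;

// Hypothetical demo class, for illustration only.
public static class DeleteOnCloseDemo
{
    // Creates a temp file with DeleteOnClose and reports whether the file
    // exists while the stream is open and after the stream was disposed.
    public static (bool WhileOpen, bool AfterDispose) Run()
    {
        string tempPath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        bool whileOpen;
        using (var stream = new FileStream(
            tempPath, FileMode.CreateNew, FileAccess.ReadWrite,
            FileShare.None, 4096, FileOptions.DeleteOnClose))
        {
            stream.WriteByte(42);
            stream.Flush();
            whileOpen = File.Exists(tempPath);
        }
        // The stream is disposed here, so the file should be gone.
        return (whileOpen, File.Exists(tempPath));
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine("Exists while open: " + result.WhileOpen);
        Console.WriteLine("Exists after dispose: " + result.AfterDispose);
    }
}
```

No explicit File.Delete call is needed, which is what makes the temporary file behave like a disposable buffer.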

Then use TempFileStream in place of the MemoryStream:

using (var compressedStream = new TempFileStream())
{
    using (var deflateStream = new DeflateStream(
        stream: compressedStream,
        compressionLevel: CompressionLevel.Optimal,
        leaveOpen: true))
    using (var fileStream = File.OpenRead(path))
    {
        fileStream.CopyTo(deflateStream);
    }

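    // Rewind first: after the copy, the position is at the end of the compressed data.
    compressedStream.Position = 0;
    f(compressedStream);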
}

Upvotes: 3

Jon Skeet

Reputation: 1503030

You can use SharpCompress for this. Its DeflateStream allows you to read the compressed data on the fly, which is exactly what you want.

Here's a complete example based on Sir Rufo's:

using System;
using System.IO;
using SharpCompress.Compressors;
using SharpCompress.Compressors.Deflate;
using System.Linq;

public class Program
{
    public static void Main()
    {
        var dataBuffer = Enumerable.Range(1, 50000).Select(e => (byte)(e % 256)).ToArray();

        using (var dataStream = new MemoryStream(dataBuffer))
        {
            // Note: this refers to SharpCompress.Compressors.Deflate.DeflateStream                
            using (var deflateStream = new DeflateStream(dataStream, CompressionMode.Compress))
            {
                ConsumeStream(deflateStream);
            }
        }
    }

    public static void ConsumeStream(Stream stream)
    {
        // Let's just prove we can reinflate to the original data...
        byte[] data;
        using (var decompressed = new MemoryStream())
        {
            using (var decompressor = new DeflateStream(stream, CompressionMode.Decompress))
            {
                decompressor.CopyTo(decompressed);
            }
            data = decompressed.ToArray();
        }
        Console.WriteLine("Reinflated size: " + data.Length);
        int errors = 0;
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] != (i + 1) % 256)
            {
                errors++;
            }
        }
        Console.WriteLine("Total errors: " + errors);
    }
}

Or using your sample code:

using System;
using System.IO;
using SharpCompress.Compressors;
using SharpCompress.Compressors.Deflate;

public class ThisWouldBeTheDatabaseClient {
  public void f(Stream s) {
    // some implementation I don't have access to
    // The only thing I know is that it reads data from the stream in some way.
    var buffer = new byte[10];
    s.Read(buffer,0,10);
  }
}

public class Program {
  public static void Main() {
    var dummyDatabaseClient = new ThisWouldBeTheDatabaseClient();
    var dataBuffer = new byte[1000];
    var fileStream= new MemoryStream( dataBuffer ); // would be "File.OpenRead(path)" in real case
    using(var dstream = new DeflateStream(
        fileStream, CompressionMode.Compress, CompressionLevel.BestCompression))
        dummyDatabaseClient.f(dstream);
  }
}

This no longer throws an exception, and it serves the compressed data as f reads from the stream.
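For comparison, a similar pull-style behaviour can be approximated with only the built-in classes. The sketch below is an assumption-laden illustration, not SharpCompress's implementation: CompressingReadStream is a hypothetical wrapper that feeds the source through System.IO.Compression.DeflateStream in small chunks and hands out whatever compressed bytes have been produced so far. It still uses a MemoryStream internally, but only as a small staging buffer, not for the whole compressed file.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper (not part of the BCL or SharpCompress): a read-only
// stream that compresses `source` lazily while the consumer reads from it.
public sealed class CompressingReadStream : Stream
{
    private readonly Stream _source;
    private readonly MemoryStream _pending = new MemoryStream(); // compressed bytes not yet handed out
    private readonly DeflateStream _deflater;
    private readonly byte[] _chunk = new byte[8192];
    private bool _sourceDrained;

    public CompressingReadStream(Stream source, CompressionLevel level = CompressionLevel.Optimal)
    {
        _source = source ?? throw new ArgumentNullException(nameof(source));
        _deflater = new DeflateStream(_pending, level, leaveOpen: true);
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        if (count == 0) return 0;
        // Feed the deflater until it has emitted something or the source is exhausted.
        while (_pending.Position == _pending.Length && !_sourceDrained)
        {
            _pending.SetLength(0); // everything pending was consumed; reuse the buffer
            int n = _source.Read(_chunk, 0, _chunk.Length);
            if (n > 0)
            {
                _deflater.Write(_chunk, 0, n);
            }
            else
            {
                _deflater.Dispose(); // writes the final deflate block into _pending
                _sourceDrained = true;
            }
            _pending.Position = 0; // switch the staging buffer from writing to reading
        }
        return _pending.Read(buffer, offset, count); // 0 only once everything was served
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            _deflater.Dispose();
            _pending.Dispose();
            _source.Dispose();
        }
        base.Dispose(disposing);
    }
}
```

With such a wrapper, the call would look like f(new CompressingReadStream(File.OpenRead(path))) inside a using block. How much data sits in the staging buffer at any time depends on when the deflater decides to emit output, so treat the memory behaviour as something to measure, not as a guarantee.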

Upvotes: 2
