Bolpat

Reputation: 1697

Why is DeflateStream slow when reading a little at a time?

I have a class that can read and write some data in a custom binary format. My (Windows Forms) application has to read a file with that format on startup. Because my format is rather redundant, I use a DeflateStream which does a great job at compressing the data.

However, it is slow in my actual application (6500–7200 ms) compared to my functionality test (185–200 ms). The test method creates the exact same file I’m reading in my application and then, for testing, reads it back and compares the results. Everything about the reading is identical. And after toying around, I found that copying the whole DeflateStream into a MemoryStream first is a lot quicker (330–380 ms) than reading the DeflateStream directly.

Slow approach:

System.Diagnostics.Stopwatch sw = new();
Tally tally = new();
string fileName = "data.dat";
using FileStream file = new(fileName, FileMode.Open, FileAccess.Read, FileShare.Read);
sw.Start();
try
{
    using DeflateStream ds = new(file, CompressionMode.Decompress, leaveOpen: true);
    tally.Read(ds); // slow
}
finally
{
    sw.Stop();
    MessageBox.Show($"Reading (Count = {tally.Count}) took {sw.ElapsedMilliseconds} ms.");
}

Fast approach:

System.Diagnostics.Stopwatch sw = new();
Tally tally = new();
string fileName = "data.dat";
using FileStream file = new(fileName, FileMode.Open, FileAccess.Read, FileShare.Read);
sw.Start();
try
{
    using DeflateStream ds = new(file, CompressionMode.Decompress, leaveOpen: true);
    using MemoryStream ms = new();
    ds.CopyTo(ms);
    ms.Seek(0, SeekOrigin.Begin);
    tally.Read(ms); // quick
}
finally
{
    sw.Stop();
    MessageBox.Show($"Reading Tally (Count = {tally.Count}) took {sw.ElapsedMilliseconds} ms.");
}

The difference is staggering, and I don’t know what I’m doing wrong. This is the Tally.Read method that does the reading:


SortedList<TallyKey, TallyValue> m_Results = new();

public void Read(Stream stream)
{
    ArgumentNullException.ThrowIfNull(stream);
    m_Results.Clear();

    {
        Span<byte> sizeBuffer = stackalloc byte[sizeof(int)];
        stream.ReadExactly(sizeBuffer);
        m_Results.Capacity = BinaryPrimitives.ReadInt32LittleEndian(sizeBuffer);
    }

    Span<byte> buffer = stackalloc byte[EntrySize];
    int bytesRead = 0;
    for (int n; (n = stream.Read(buffer[bytesRead..])) is not 0; )
    {
        if ((bytesRead += n) < EntrySize) continue;
        bytesRead = 0;
        // FromLittleEndianData creates a TallyKey/TallyValue
        // via several BinaryPrimitives reads; each is a small struct
        // (TallyKey: 5 ints, TallyValue: TimeSpan + int).
        // TallyKey.SerializedSize = 7 (bytes)
        // TallyValue.SerializedSize = 8 (bytes)
        // EntrySize = TallyKey.SerializedSize + TallyValue.SerializedSize
        m_Results.Add(
            TallyKey.FromLittleEndianData(buffer[..TallyKey.SerializedSize]),
            TallyValue.FromLittleEndianData(buffer[TallyKey.SerializedSize..])
        );
    }
    if (bytesRead is not 0) throw new InvalidDataException(
        message: "The input stream has an unexpected length. It is likely ill-formed."
    );
}
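
FromLittleEndianData itself isn’t shown above. Purely as a hypothetical sketch — the field layout below is invented just to match the 7- and 8-byte sizes, my real key packs more fields into those bytes — the decoders look roughly like this:

using System;
using System.Buffers.Binary;

// Hypothetical layouts chosen only to match SerializedSize = 7 and 8 bytes.
// (The real TallyKey is also comparable so it can be a SortedList key; omitted here.)
readonly record struct TallyKey(int Id, short Group, byte Kind)
{
    public const int SerializedSize = 7;

    public static TallyKey FromLittleEndianData(ReadOnlySpan<byte> data) => new(
        Id:    BinaryPrimitives.ReadInt32LittleEndian(data),       // bytes 0..3
        Group: BinaryPrimitives.ReadInt16LittleEndian(data[4..]),  // bytes 4..5
        Kind:  data[6]);                                           // byte 6
}

readonly record struct TallyValue(TimeSpan Duration, int Count)
{
    public const int SerializedSize = 8;

    public static TallyValue FromLittleEndianData(ReadOnlySpan<byte> data) => new(
        // assumes the TimeSpan is stored as whole seconds in an int
        Duration: TimeSpan.FromSeconds(BinaryPrimitives.ReadInt32LittleEndian(data)),
        Count:    BinaryPrimitives.ReadInt32LittleEndian(data[4..]));
}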

Upvotes: 0

Views: 86

Answers (1)

dbc

Reputation: 117104

Assuming that your Tally.Read() is just reading the stream byte-by-byte using DeflateStream.ReadByte(), Microsoft's DeflateStream is apparently much more performant when inflating chunk-by-chunk rather than byte-by-byte. Accepting this as given, if your algorithm needs to read byte-by-byte, you can easily match the performance of copying to a MemoryStream and reading from the copy by wrapping the DeflateStream in a BufferedStream with an appropriately large buffer size, say 8192 bytes:

const int BufferSize = 8192;

using FileStream file = new(fileName, FileMode.Open, FileAccess.Read, FileShare.Read);
using DeflateStream ds = new(file, CompressionMode.Decompress, leaveOpen: true);
using BufferedStream inputStream = new(ds, BufferSize);
tally.Read(inputStream); // Fast

With this fix, the overall time required becomes slightly faster than copying to a temporary MemoryStream.

Alternatively, as Jon Skeet suggested in comments, if you could modify your Tally.Read() method to read the stream into a buffer, then iterate through the buffer, you could get similar performance.

For instance, if your original Tally looked like:

class Tally
{
    public long Count { get; set; } = 0;

    public void Read(Stream s)
    {
        while (s.ReadByte() is var b && b >= 0)
            Count += b;
    }
}

If I modify it to use a small Span<byte> buffer like so:

const int BufferSize = 128;

public void Read(Stream s)
{
    Span<byte> span = stackalloc byte[BufferSize];

    while (s.Read(span) is var count && count > 0)
        foreach (var b in span.Slice(0, count))
            Count += b;
}

Then the performance once again becomes faster than using a MemoryStream.

A demo here shows the following results:

Directly reading from DeflateStream (Count = 122842320) took 111 ms.
Reading from a MemoryStream copied from a DeflateStream (Count = 122842320) took 14 ms.
Reading from a BufferedStream wrapping a DeflateStream (Count = 122842320) took 10 ms.
Directly reading from DeflateStream using a 128-byte span (Count = 122842320) took 11 ms.

As you can see, using a BufferedStream or a Span<byte> buffer offers a 10x speedup over reading byte-by-byte from a DeflateStream.
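
For reference, a rough sketch of the kind of harness that could produce timings in that shape is below — the data.dat file name and the byte-summing Tally from above are stand-ins, not the exact demo code (the fourth, span-based case needs the modified Read and is omitted):

using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;

// Assumes "data.dat" is a deflate-compressed test file and Tally is the
// byte-summing class shown above; both are placeholders.
static void Time(string label, Action<Tally> read)
{
    var tally = new Tally();
    var sw = Stopwatch.StartNew();
    read(tally);
    sw.Stop();
    Console.WriteLine($"{label} (Count = {tally.Count}) took {sw.ElapsedMilliseconds} ms.");
}

// DeflateStream disposes the inner FileStream here because leaveOpen defaults to false.
static DeflateStream Open() => new(
    new FileStream("data.dat", FileMode.Open, FileAccess.Read, FileShare.Read),
    CompressionMode.Decompress);

Time("Directly reading from DeflateStream", t =>
{
    using var ds = Open();
    t.Read(ds);
});

Time("Reading from a MemoryStream copied from a DeflateStream", t =>
{
    using var ds = Open();
    using var ms = new MemoryStream();
    ds.CopyTo(ms);
    ms.Position = 0;
    t.Read(ms);
});

Time("Reading from a BufferedStream wrapping a DeflateStream", t =>
{
    using var ds = Open();
    using var bs = new BufferedStream(ds, 8192);
    t.Read(bs);
});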

Upvotes: 3
