markmuetz

Reputation: 9664

OutOfMemoryException thrown while serializing list of objects that contain a large byte array

I'm trying to serialize a reasonably large amount of data with protobuf.net. I'm hitting problems with OutOfMemoryExceptions being thrown. I'm trying to stream the data using IEnumerable<DTO> so as not to use too much memory. Here's a simplified version of the program that should cause the error:

using System;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        using (var f = File.Create("Data.protobuf"))
        {
            ProtoBuf.Serializer.Serialize<IEnumerable<DTO>>(f, GenerateData(1000000));
        }

        using (var f = File.OpenRead("Data.protobuf"))
        {
            var dtos = ProtoBuf.Serializer.DeserializeItems<DTO>(f, ProtoBuf.PrefixStyle.Base128, 1);
            Console.WriteLine(dtos.Count());
        }
        Console.Read();
    }

    static IEnumerable<DTO> GenerateData(int count)
    {
        for (int i = 0; i < count; i++)
        {
            // reduce to 1100 to use much less memory
            var dto = new DTO { Data = new byte[1101] };
            for (int j = 0; j < dto.Data.Length; j++)
            {
                // fill with data
                dto.Data[j] = (byte)(i + j);
            }
            yield return dto;
        }
    }
}

[ProtoBuf.ProtoContract]
class DTO
{
    [ProtoBuf.ProtoMember(1, DataFormat=ProtoBuf.DataFormat.Group)]
    public byte[] Data
    {
        get;
        set;
    }
}

Interestingly, if you reduce the size of the array on each DTO to 1100 bytes, the problem goes away! In my actual code I'd like to do something similar, but it's an array of floats, not bytes, that I'll be serializing. N.B. I think you can skip the filling-with-data part and still reproduce the problem more quickly.

This is using protobuf version 2.0.0.594. Any help would be much appreciated!

EDIT:

Same problem seen with version 2.0.0.480. Code wouldn't run with version 1.0.0.280.

Upvotes: 4

Views: 2311

Answers (2)

Marc Gravell

Reputation: 1063338

k; this was some unfortunate timing - basically, it was only checking whether it should flush whenever the buffer got full, and because it was in the middle of writing a length-prefixed item at that point, it was never able to flush properly. I've added a tweak so that whenever it reaches a flushable state and there is something worth flushing (currently 1024 bytes), it flushes more aggressively. This has been committed as r597. With that patch, it now works as expected.
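To illustrate the behaviour described above (a toy sketch, not protobuf-net's actual internals - the class and method names here are hypothetical): a writer that only flushes when its buffer fills up can be stuck mid-item at exactly that moment and never get a safe chance to flush; flushing at item boundaries once a threshold of pending bytes is exceeded keeps the buffer bounded.

```python
import io

class BoundedWriter:
    """Toy writer: flushes pending bytes at item boundaries once a
    threshold (1024 bytes, mirroring the patch) is exceeded."""
    FLUSH_THRESHOLD = 1024

    def __init__(self, sink):
        self.sink = sink          # any object with a write(bytes) method
        self.buffer = bytearray()
        self.peak = 0             # high-water mark of buffered bytes

    def write(self, data):
        self.buffer.extend(data)
        self.peak = max(self.peak, len(self.buffer))

    def end_item(self):
        # A "flushable state": the current item is fully written, so the
        # buffered bytes can safely be handed to the sink.
        if len(self.buffer) >= self.FLUSH_THRESHOLD:
            self.sink.write(bytes(self.buffer))
            self.buffer.clear()

sink = io.BytesIO()
w = BoundedWriter(sink)
for i in range(10_000):
    w.write(b"x" * 1101)   # one serialized item
    w.end_item()           # boundary reached -> eligible to flush
print(w.peak)  # 1101 - stays at one item's size, not 10_000 * 1101
```

Without the boundary check, the equivalent writer would accumulate all 10,000 items before flushing.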

In the interim, there is a way of avoiding the glitch without changing version: iterate over the data at the source, serializing each item individually with SerializeWithLengthPrefix, specifying prefix-style base-128 and field-number 1; this is 100% identical in terms of what goes over the wire, but uses a separate serialization cycle for each item:

using (var f = File.Create("Data.protobuf"))
{
    foreach(var obj in GenerateData(1000000))
    {
        Serializer.SerializeWithLengthPrefix<DTO>(
            f, obj, PrefixStyle.Base128, Serializer.ListItemTag);
    }
}
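The "100% identical on the wire" claim follows from the wire format: each item is framed as field tag 1 with wire type 2 (one byte, 0x0A), then the payload length as a base-128 varint, then the payload bytes. A quick sketch of that framing in Python (illustrative only, not protobuf-net code; the payload here is a dummy blob, not a real serialized DTO):

```python
def encode_varint(n):
    """Base-128 varint: 7 bits per byte, little-endian, MSB = continuation."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)   # more bytes follow
        else:
            out.append(b)
            return bytes(out)

def frame_item(payload, field_number=1):
    """Length-prefixed framing: tag byte, varint length, then payload."""
    tag = (field_number << 3) | 2   # wire type 2 = length-delimited
    return bytes([tag]) + encode_varint(len(payload)) + payload

# A 1101-byte payload is framed as 0x0A (tag), 0xCD 0x08 (varint 1101), data
print(frame_item(b"\x00" * 1101)[:3].hex())  # 0acd08
```

Both Serialize<IEnumerable<DTO>> and the per-item SerializeWithLengthPrefix loop emit exactly this sequence of framed items, which is also why DeserializeItems can read back either output.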

Thanks for noticing ;p

Upvotes: 3

Kiril

Reputation: 40375

It seems that you're passing the 1.5 GB limit: Allocating more than 1,000 MB of memory in 32-bit .NET process

You've already noticed that when you reduce the sample size, your application runs fine. This is not an issue with protobuf (I presume), but with the attempt to allocate more than 1.5 GB of memory for your data.

Update

Here is a simple test:

byte[] data = new byte[2147483648];

That should cause an OutOfMemoryException, so would this:

byte[][] buffer = new byte[1024][];
for (int i = 0; i < 1024; i++)
{
    buffer[i] = new byte[2097152];
}

Something is aggregating your data bytes into a contiguous container of more than 1.5 GB.
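A back-of-the-envelope check of the sizes involved (the ~3 bytes of per-item framing overhead is an estimate, and the buffer-doubling behaviour is an assumption about typical growable containers, not a verified detail of protobuf-net):

```python
items = 1_000_000
payload = 1101     # bytes per DTO in the repro
overhead = 3       # est.: ~1 tag byte + 2-byte varint length per item

total = items * (payload + overhead)
print(total)                      # 1104000000 bytes
print(round(total / 2**30, 2))    # 1.03 GiB

# A contiguous buffer that grows by doubling briefly keeps both the old
# and the new copy alive, so the transient footprint can be ~1.5x the
# payload - enough to hit the 32-bit limits cited above.
print(round(total * 1.5 / 2**30, 2))  # 1.54 GiB transient estimate
```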

Upvotes: 2
