bitinn

Reputation: 9348

A good way to serialize a known array of small structs

I have a large array of small structs that I would like to serialize to a file.

The struct:

public struct Voxel
{
  public byte density;
  public byte material;
}

While there are quite a few serialization libraries that can do general serialization very efficiently, I suspect we can do even better in terms of on-disk size and serialization/deserialization speed, given that we know and control this struct.

This struct is pretty final, so we can do without the fancy versioning that many serialization libraries support.

From my search, it seems like Marshal might be a decent way to do such a thing, but I don't want to worry about things like endianness.

So I wonder: what might be some good ways to serialize such data, assuming the array size can be anywhere from 100 to 1 million elements?

(Also assuming we are not afraid to store them in different formats such that RLE can reduce the on-disk size even more.)
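
To make the RLE idea concrete, here is a minimal sketch, assuming a hypothetical on-disk layout of (run length, density, material) byte triples with runs capped at 255. The VoxelRle class and this layout are illustrative assumptions, not an established format:

```csharp
using System.Collections.Generic;

public struct Voxel
{
    public byte density;
    public byte material;
}

// Hypothetical helper: run-length encodes voxels as (runLength, density, material) triples.
public static class VoxelRle
{
    public static byte[] Encode(Voxel[] voxels)
    {
        var output = new List<byte>();
        int i = 0;
        while (i < voxels.Length)
        {
            Voxel v = voxels[i];
            int run = 1;
            // Extend the run while the next voxel is identical; cap at 255 so it fits in one byte.
            while (i + run < voxels.Length && run < 255
                   && voxels[i + run].density == v.density
                   && voxels[i + run].material == v.material)
            {
                run++;
            }
            output.Add((byte)run);
            output.Add(v.density);
            output.Add(v.material);
            i += run;
        }
        return output.ToArray();
    }

    public static Voxel[] Decode(byte[] data)
    {
        var result = new List<Voxel>();
        // Each triple expands back into runLength copies of the same voxel.
        for (int i = 0; i + 2 < data.Length; i += 3)
        {
            for (int n = 0; n < data[i]; n++)
            {
                result.Add(new Voxel { density = data[i + 1], material = data[i + 2] });
            }
        }
        return result.ToArray();
    }
}
```

Because both fields are single bytes, this layout has no multi-byte values and so no endianness concerns. It only pays off when the data contains long runs of identical voxels; for noisy data each voxel costs 3 bytes instead of 2.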

Upvotes: 2

Views: 635

Answers (2)

bitinn

Reputation: 9348

For future readers: I have figured out a way to use @Marc Gravell's proposed solution within the Unity engine, which only supports .NET Standard 2.0.

The trick was to get the high performance package from Microsoft:

https://learn.microsoft.com/en-us/windows/communitytoolkit/high-performance/introduction

This means that you can use it from anything from UWP or legacy .NET Framework applications, games written in Unity, cross-platform mobile applications using Xamarin, to .NET Standard libraries and modern .NET Core 2.1 or .NET Core 3.1 applications.

It supports Stream.Write and Stream.Read with Span<T> through extension methods:

https://learn.microsoft.com/en-us/dotnet/api/microsoft.toolkit.highperformance.extensions.streamextensions?view=win-comm-toolkit-dotnet-6.1

I also compared the on-disk size of the binary serialization:

  • In memory: 8 KB (16x16x16 array)
  • MessagePack format: 97 KB
  • MemoryMarshal.Cast: 8 KB

So it's working as expected!
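
The 8 KB figure follows from the struct layout: 16 x 16 x 16 = 4096 voxels at 2 bytes each is 8192 bytes. Below is a minimal sketch that checks this with an in-memory round trip. It assumes a modern .NET runtime where Stream has Span overloads built in; on .NET Standard 2.0 the same calls would have to come from the toolkit's extension methods linked above. SizeCheck and RoundTripLength are hypothetical names:

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public struct Voxel
{
    public byte density;
    public byte material;
}

// Hypothetical helper for illustration only.
public static class SizeCheck
{
    // Writes a 16x16x16 voxel array to a MemoryStream, reads it back,
    // and returns the number of bytes that were serialized.
    public static long RoundTripLength(out Voxel[] copy)
    {
        var voxels = new Voxel[16 * 16 * 16];
        voxels[42].density = 200;
        voxels[42].material = 7;

        using var ms = new MemoryStream();

        // View the array as raw bytes (no copy) and write it out.
        ReadOnlySpan<byte> source = MemoryMarshal.Cast<Voxel, byte>(voxels.AsSpan());
        ms.Write(source);

        // Read the bytes straight back into a fresh Voxel array.
        ms.Position = 0;
        copy = new Voxel[voxels.Length];
        Span<byte> dest = MemoryMarshal.Cast<Voxel, byte>(copy.AsSpan());
        int read;
        while (!dest.IsEmpty && (read = ms.Read(dest)) > 0)
        {
            dest = dest.Slice(read);
        }

        return ms.Length;
    }
}
```

The serialized length equals the in-memory size because the cast writes the struct's bytes as-is, with no headers, type tags, or per-element framing.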

Upvotes: 1

Marc Gravell

Reputation: 1063068

Assuming you're using a recent framework: spans are your friend. As a trivial way of writing them:

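// Requires: using System.IO; using System.Runtime.InteropServices;
Voxel[] arr = ...
// Reinterpret the Voxel array as raw bytes; no copy is made.
var bytes = MemoryMarshal.Cast<Voxel, byte>(arr);
// File.Create truncates an existing file; File.OpenWrite would leave stale trailing bytes.
using (var s = File.Create("some.path"))
{
    s.Write(bytes);
}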

Reading is a little harder, but not much:

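// Requires: using System.IO; using System.Runtime.CompilerServices; using System.Runtime.InteropServices;
Voxel[] arr;
using (var s = File.OpenRead("some.path"))
{
    // Element count = file length / struct size; checked() throws if it does not fit in an int.
    int len = checked((int)(s.Length / Unsafe.SizeOf<Voxel>())), read;
    arr = new Voxel[len];
    var bytes = MemoryMarshal.Cast<Voxel, byte>(arr);
    // Read may return fewer bytes than requested, so loop until the span is full or EOF.
    while (!bytes.IsEmpty && (read = s.Read(bytes)) > 0)
    {
        bytes = bytes.Slice(read);
    }
}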

Note that this presumes you want to start with a vector (Voxel[]); if you're happy to take a further step down the rabbit hole, memory-mapped files are also an option here, again using Span<T> (or Memory<T>). Then it becomes truly zero-copy: your live data is the file, via OS magic.
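
As a rough sketch of the memory-mapped route, assuming the project allows unsafe code (AllowUnsafeBlocks) and that the file either does not exist yet or is no larger than the requested capacity. MappedVoxels and SetFirstDensity are hypothetical names, and this is one possible way to get a Span<Voxel> over a mapping, not the only one:

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

public struct Voxel
{
    public byte density;
    public byte material;
}

// Hypothetical helper for illustration only.
public static class MappedVoxels
{
    // Maps the file at `path` as `count` voxels and sets the density of the first one.
    // The write goes directly into the mapped pages; there is no intermediate buffer.
    public static unsafe void SetFirstDensity(string path, int count, byte density)
    {
        long byteLength = (long)count * sizeof(Voxel);

        // OpenOrCreate with a capacity grows a smaller (or new) file to byteLength.
        using var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.OpenOrCreate, null, byteLength);
        using var view = mmf.CreateViewAccessor(0, byteLength);

        byte* ptr = null;
        view.SafeMemoryMappedViewHandle.AcquirePointer(ref ptr);
        try
        {
            // PointerOffset accounts for the view not starting on an allocation boundary.
            var voxels = new Span<Voxel>(ptr + view.PointerOffset, count);
            voxels[0].density = density;
        }
        finally
        {
            // Every AcquirePointer must be paired with a ReleasePointer.
            view.SafeMemoryMappedViewHandle.ReleasePointer();
        }
    }
}
```

The span is only valid while the view is alive and the pointer is held, so it must not escape the try block.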

Upvotes: 3
