Reputation: 8014
I receive collections in chunks from a server, and I want to write them to a file in a way that lets me read them back one by one later. My objects are fixed size, meaning the class only contains fields of type double, long and DateTime.
I already serialize and deserialize objects using the methods below at different places in my project:
    public static T Deserialize<T>(byte[] buffer)
    {
        using (MemoryStream stream = new MemoryStream(buffer))
        {
            return Serializer.Deserialize<T>(stream);
        }
    }

    public static byte[] Serialize<T>(T message)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            Serializer.Serialize(stream, message);
            return stream.ToArray();
        }
    }
But even if this works, I still think it will produce a larger output file, because I believe protobuf stores some information about field names (in its own way). I could instead create the byte[] using BinaryWriter without any field-name information. I know I would need to make sure I read the values back in the right order, but I think this could still have a meaningful impact on the output file size, especially when the number of objects in the collection is really huge.
Is there a way to efficiently write collections in parts and read them back one by one, while keeping both the output files and the memory footprint during reading as small as possible? My collections are really large, containing years of market data that I need to read and process. I only need to read each object once, process it, and forget about it. I have no need to keep objects in memory.
Upvotes: 1
Views: 547
Reputation: 1063884
Protobuf doesn't store field names, but it does use a field prefix that is an encoded integer. For storing multiple objects, you would typically use the *WithLengthPrefix overloads, since the encoded objects are not fixed length; in particular, DateTime has no reliable fixed-length encoding.
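As a minimal sketch of the length-prefix approach with protobuf-net: each object is appended with its own length prefix, and the file is later streamed back one item at a time. The `TickMessage` type, its members, and the `TickFile` helper are hypothetical names invented for illustration; only `Serializer.SerializeWithLengthPrefix`, `Serializer.DeserializeItems` and `PrefixStyle` come from the library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using ProtoBuf;

// Hypothetical message type; the member names are assumptions.
[ProtoContract]
public class TickMessage
{
    [ProtoMember(1)] public double Price { get; set; }
    [ProtoMember(2)] public long Volume { get; set; }
    [ProtoMember(3)] public DateTime Time { get; set; }
}

// Hypothetical helper, not part of protobuf-net.
public static class TickFile
{
    // Appends one chunk to the file, writing a length prefix before each object.
    public static void Append(string path, IEnumerable<TickMessage> chunk)
    {
        using (var stream = new FileStream(path, FileMode.Append, FileAccess.Write))
        {
            foreach (var tick in chunk)
            {
                Serializer.SerializeWithLengthPrefix(stream, tick, PrefixStyle.Base128, 1);
            }
        }
    }

    // Lazily yields objects one at a time; only the current item is materialized.
    public static IEnumerable<TickMessage> ReadAll(string path)
    {
        using (var stream = File.OpenRead(path))
        {
            foreach (var tick in Serializer.DeserializeItems<TickMessage>(stream, PrefixStyle.Base128, 1))
            {
                yield return tick;
            }
        }
    }
}
```

Because `ReadAll` is an iterator, a `foreach` over it processes each object and lets it go out of scope before the next one is decoded.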
However! In your case, perhaps a serializer isn't the right tool. I would consider:
- creating a readonly struct composed of a double and two long (or three long if you need high precision epoch time)
- accessing the file through a memory mapped file
- creating a Span<byte> over the memory mapped file (or a section thereof)
- casting the Span<byte> to a Span<YourStruct> using MemoryMarshal.Cast

Et voilà, direct access to your values all the way to the file system.
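The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the `TickRecord` struct, its field names, and the `TickStore` helper are hypothetical; the DateTime is stored as its `Ticks` value in a long; the code needs a runtime with `Span<T>` support and must be compiled with unsafe code enabled; and the file is assumed to be non-empty and small enough for a single span (under 2 GB), otherwise you would map it a section at a time.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Runtime.InteropServices;

// Hypothetical fixed-size record: 24 bytes, no references, no padding.
[StructLayout(LayoutKind.Sequential, Pack = 8)]
public readonly struct TickRecord
{
    public readonly double Price;
    public readonly long Volume;
    public readonly long TimeTicks; // DateTime.Ticks, an assumption about the encoding

    public TickRecord(double price, long volume, DateTime time)
    {
        Price = price;
        Volume = volume;
        TimeTicks = time.Ticks;
    }

    public DateTime Time => new DateTime(TimeTicks, DateTimeKind.Utc);
}

// Hypothetical helper for illustration only.
public static class TickStore
{
    // Appends a chunk by writing the raw bytes of the structs, with no per-item overhead.
    public static void Append(string path, ReadOnlySpan<TickRecord> chunk)
    {
        using (var stream = new FileStream(path, FileMode.Append, FileAccess.Write))
        {
            stream.Write(MemoryMarshal.AsBytes(chunk));
        }
    }

    // Example of a single pass over the records straight from the mapped file.
    public static unsafe double SumPrices(string path)
    {
        long length = new FileInfo(path).Length; // assumed > 0 and < 2 GB
        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (var view = mmf.CreateViewAccessor(0, length, MemoryMappedFileAccess.Read))
        {
            byte* ptr = null;
            view.SafeMemoryMappedViewHandle.AcquirePointer(ref ptr);
            try
            {
                // Span<byte> over the mapped view, then reinterpret it as records.
                var bytes = new ReadOnlySpan<byte>(ptr + view.PointerOffset, checked((int)length));
                ReadOnlySpan<TickRecord> records = MemoryMarshal.Cast<byte, TickRecord>(bytes);

                double sum = 0;
                foreach (ref readonly TickRecord record in records)
                {
                    sum += record.Price;
                }
                return sum;
            }
            finally
            {
                view.SafeMemoryMappedViewHandle.ReleasePointer();
            }
        }
    }
}
```

Nothing is deserialized or copied here: the operating system pages the file in as the loop touches it, so memory use stays low even for very large files. The raw layout depends on the machine's byte order, so this format suits files read on the same kind of machine that wrote them.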
Upvotes: 1