Reputation: 1594
I have a WinForms application which needs a performance enhancement. I write a very large number of files (around 10,000 of them, each 200-500 KB) which are protobuf-serialized and written using normal File I/O; their total size exceeds 3 GB. Then, every 5 to 7 minutes, I read about half of those files one by one, merge in new data, and serialize them again. As you can imagine, this process consumes a very large amount of RAM at that frequency.
I came across a possible solution using memory-mapped files and wrote the test code below:
// requires System.Diagnostics and System.IO.MemoryMappedFiles
byte[] Buffer = GZipCompressor.ConvertToByteStream<OHLCData>(sampleObj);
Process proc = Process.GetCurrentProcess();

// capacity must cover the 54-byte header region, the 2-byte length and the payload
using (MemoryMappedFile mmf = MemoryMappedFile.CreateNew("test", 54 + 2 + Buffer.Length))
{
    MemoryMappedViewAccessor accessor = mmf.CreateViewAccessor();
    accessor.Write(54, (ushort)Buffer.Length);
    accessor.WriteArray(54 + 2, Buffer, 0, Buffer.Length);
    Console.WriteLine(proc.PrivateMemorySize64 / 1024);

    // the named map only exists while a handle to it is open,
    // so open it again (rather than CreateNew) to read it back
    using (MemoryMappedFile mmf2 = MemoryMappedFile.OpenExisting("test"))
    {
        MemoryMappedViewAccessor accessor2 = mmf2.CreateViewAccessor();
        ushort Size = accessor2.ReadUInt16(54);
        byte[] buffer = new byte[Size];
        accessor2.ReadArray(54 + 2, buffer, 0, buffer.Length);
        Console.WriteLine(proc.PrivateMemorySize64 / 1024);
    }
}
// then I convert the buffer back to the class...
With the above code I am not able to achieve the performance improvement I am looking for; my RAM usage is approximately the same as before (or at least not as low as expected).
I have another idea: creating a zip of a group of files using ZipArchive and assigning it to the MMF.
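A rough sketch of that idea, reading the archive through a file-backed map instead of loading the whole file into a byte[] first ("group1.zip" is just a placeholder name; whether this actually lowers RAM usage is what I would need to measure):

// requires System.IO, System.IO.Compression and System.IO.MemoryMappedFiles
using (MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile("group1.zip", FileMode.Open))
using (MemoryMappedViewStream view = mmf.CreateViewStream())
using (ZipArchive zip = new ZipArchive(view, ZipArchiveMode.Read))
{
    foreach (ZipArchiveEntry entry in zip.Entries)
    {
        using (Stream entryStream = entry.Open())
        {
            // deserialize this entry's protobuf payload here
        }
    }
}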
My question:
Note: Creating a dictionary for the data and storing that dictionary is not feasible for me, given my code structure.
Edit: Note that in the above sample I am not just appending data at the end; I have to make changes to the previous data too, like removing deprecated data from the start.
Example representation of the task:
File stored:
1,1
2,1
3,1
4,1
Data to be merged:
3,2
5,2
Final output:
2,1
3,3
4,1
5,2
Note that in the above example the deprecated 1,1 is removed, 3,1 is updated to 3,3, and 5,2 is a new element.
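To make the merge concrete, here is a rough sketch of what it does, assuming each file is a key-sorted list of (key, value) pairs, values of equal keys are summed, and keys at or below some cutoff are deprecated (MergeSorted and cutoffKey are illustrative names only, not from my code):

// Rough sketch (illustrative names): streaming merge of two key-sorted
// sequences; equal keys get their values summed, and keys at or below
// cutoffKey are dropped as deprecated. Requires System.Collections.Generic.
static IEnumerable<(int Key, int Value)> MergeSorted(
    IEnumerator<(int Key, int Value)> stored,
    IEnumerator<(int Key, int Value)> incoming,
    int cutoffKey)
{
    bool hasA = stored.MoveNext(), hasB = incoming.MoveNext();
    while (hasA || hasB)
    {
        (int Key, int Value) next;
        if (hasA && hasB && stored.Current.Key == incoming.Current.Key)
        {
            next = (stored.Current.Key, stored.Current.Value + incoming.Current.Value);
            hasA = stored.MoveNext();
            hasB = incoming.MoveNext();
        }
        else if (hasB && (!hasA || incoming.Current.Key < stored.Current.Key))
        {
            next = incoming.Current;
            hasB = incoming.MoveNext();
        }
        else
        {
            next = stored.Current;
            hasA = stored.MoveNext();
        }
        if (next.Key > cutoffKey)
            yield return next;   // keys at or below cutoffKey are deprecated
    }
}

With the stored data and the data to be merged above and cutoffKey = 1, this yields 2,1 / 3,3 / 4,1 / 5,2, matching the final output.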
Upvotes: 1
Views: 491
Reputation: 396
Hi, reading your post I got a bit confused.
You are getting data that you serialize and store to disk. This creates the following problem: you have to load the data again (that is one buffer) and then allocate or keep a second buffer to deserialize into. What would happen if you saved the data in a non-serialized state?
The second thing I got confused about: do you merge files that have already been merged before? For example, you get files foo1 and foo2 and merge them into file foo12, and at some later point you get a third file foo3 and merge it into foo12? In that case you will have hefty memory consumption. Check whether you can bit-pack your data, or review data types that are larger than you need, e.g. reduce an int to a uint8 (byte), or use something else.
If you use protobuf only to compress data, that is not a good idea. There are compression algorithms that do that far better and very fast. Are you bound to protobuf?
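For example, a minimal sketch using the built-in DeflateStream (rawBytes is just a placeholder for your payload; faster third-party compressors also exist outside the framework):

// requires System.IO and System.IO.Compression
static byte[] Compress(byte[] rawBytes)
{
    using (var output = new MemoryStream())
    {
        using (var deflate = new DeflateStream(output, CompressionLevel.Fastest))
        {
            deflate.Write(rawBytes, 0, rawBytes.Length);
        }
        // ToArray is called after the DeflateStream is disposed, so the data is flushed
        return output.ToArray();
    }
}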
One more question: why is your data not minimal? For example:
1,4
2,4
3,4
4,4
Could be:
T 4
1
2
3
4
With that you have less information to handle. Yes, you have to keep track of some other things, but nothing is perfect.
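A rough sketch of that grouping (the record shape and the WriteGrouped name are just illustrative):

// requires System.IO and System.Linq
// Writes "T <value>" once per shared value, then only the keys,
// instead of repeating the value on every row.
static void WriteGrouped(TextWriter writer, IEnumerable<(int Key, int Value)> rows)
{
    foreach (var group in rows.GroupBy(r => r.Value))
    {
        writer.WriteLine($"T {group.Key}");   // the shared value
        foreach (var row in group)
            writer.WriteLine(row.Key);        // keys only
    }
}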
Upvotes: 1