Reputation: 457
I'm having a little resource problem here. It seems that .NET creates an awful lot of memory overhead and/or doesn't release memory it no longer needs. But on to the problem:
I have an object of the following class which reads an STL file:
public class cSTLBinaryDataModel
{
    public byte[] header { get; private set; }
    public UInt32 triangleCount { get { return Convert.ToUInt32(triangleList.Count); } }
    public List<cSTLTriangle> triangleList { get; private set; }

    public cSTLBinaryDataModel()
    {
        header = new byte[80];
        triangleList = new List<cSTLTriangle>();
    }

    public void ReadFromFile(string in_filePath)
    {
        byte[] stlBytes;
        //Memory logpoint 1
        stlBytes = File.ReadAllBytes(in_filePath);
        //Memory logpoint 2
        ReadHeader(stlBytes.SubArray(0, cConstants.BYTES_IN_HEADER));
        ReadTriangles(stlBytes.SubArray(cConstants.BYTES_IN_HEADER, stlBytes.Length - cConstants.BYTES_IN_HEADER));
        //Evaluate memory logpoints here
    }

    private void ReadHeader(byte[] in_header)
    {
        header = in_header;
    }

    private void ReadTriangles(byte[] in_triangles)
    {
        UInt32 numberOfTriangles = BitConverter.ToUInt32(cHelpers.HandleLSBFirst(in_triangles.SubArray(0, 4)), 0);
        //Memory logpoint 3
        for (UInt32 i = 0; i < numberOfTriangles; i++)
        {
            triangleList.Add(new cSTLTriangle(in_triangles.SubArray(Convert.ToInt32(i * cConstants.BYTES_PER_TRIANGLE + 4), Convert.ToInt32(cConstants.BYTES_PER_TRIANGLE))));
        }
        //Memory logpoint 4
    }
}
My STL file is quite big (but can get even bigger); it contains 10,533,050 triangles, so it's roughly 520 MB on disk. The cSTLTriangle class that gets added to triangleList is the following:
public class cSTLTriangle
{
    public cVector normalVector { get; private set; }
    public cVector[] vertices { get; private set; }
    public UInt16 attributeByteCount { get; private set; }
    public bool triangleFilledWithExternalValues { get; private set; }

    public cSTLTriangle(byte[] in_bytes)
    {
        Initialize();
        normalVector = new cVector(BitConverter.ToSingle(cHelpers.HandleLSBFirst(in_bytes.SubArray(0, 4)), 0),
                                   BitConverter.ToSingle(cHelpers.HandleLSBFirst(in_bytes.SubArray(4, 4)), 0),
                                   BitConverter.ToSingle(cHelpers.HandleLSBFirst(in_bytes.SubArray(8, 4)), 0));
        vertices[0] = new cVector(BitConverter.ToSingle(cHelpers.HandleLSBFirst(in_bytes.SubArray(12, 4)), 0),
                                  BitConverter.ToSingle(cHelpers.HandleLSBFirst(in_bytes.SubArray(16, 4)), 0),
                                  BitConverter.ToSingle(cHelpers.HandleLSBFirst(in_bytes.SubArray(20, 4)), 0));
        vertices[1] = new cVector(BitConverter.ToSingle(cHelpers.HandleLSBFirst(in_bytes.SubArray(24, 4)), 0),
                                  BitConverter.ToSingle(cHelpers.HandleLSBFirst(in_bytes.SubArray(28, 4)), 0),
                                  BitConverter.ToSingle(cHelpers.HandleLSBFirst(in_bytes.SubArray(32, 4)), 0));
        vertices[2] = new cVector(BitConverter.ToSingle(cHelpers.HandleLSBFirst(in_bytes.SubArray(36, 4)), 0),
                                  BitConverter.ToSingle(cHelpers.HandleLSBFirst(in_bytes.SubArray(40, 4)), 0),
                                  BitConverter.ToSingle(cHelpers.HandleLSBFirst(in_bytes.SubArray(44, 4)), 0));
        attributeByteCount = BitConverter.ToUInt16(cHelpers.HandleLSBFirst(in_bytes.SubArray(48, 2)), 0);
        triangleFilledWithExternalValues = true;
    }

    public cSTLTriangle(cVector in_vertex1, cVector in_vertex2, cVector in_vertex3)
    {
        Initialize();
        vertices[0] = in_vertex1;
        vertices[1] = in_vertex2;
        vertices[2] = in_vertex3;
        normalVector = cVectorOperations.CrossProduct(cVectorOperations.GetDirectionVector(vertices[0], vertices[1]),
                                                      cVectorOperations.GetDirectionVector(vertices[0], vertices[2]));
    }

    /// <summary>
    /// Resets the object to a defined state
    /// </summary>
    private void Initialize()
    {
        vertices = new cVector[3];
        //from here on not strictly necessary, but it helps with resetting the object after an error
        normalVector = new cVector(0, 0, 0);
        vertices[0] = new cVector(0, 0, 0);
        vertices[1] = new cVector(0, 0, 0);
        vertices[2] = new cVector(0, 0, 0);
        attributeByteCount = 0;
        triangleFilledWithExternalValues = false;
    }
}
The cVector class (sorry for this much code) is:
public class cVector : ICloneable
{
    public float component1 { get; set; }
    public float component2 { get; set; }
    public float component3 { get; set; }
    public double Length { get { return Math.Sqrt(Math.Pow(component1, 2) + Math.Pow(component2, 2) + Math.Pow(component3, 2)); } }

    public cVector(float in_value1, float in_value2, float in_value3)
    {
        component1 = in_value1;
        component2 = in_value2;
        component3 = in_value3;
    }

    public object Clone()
    {
        return new cVector(component1, component2, component3);
    }
}
If I add up the sizes of the types used in my classes, I arrive at 51 bytes for one instance of cSTLTriangle. I am aware that there has to be some overhead to accommodate methods and such. But if I multiply this size by the number of triangles, I end up at 512.3 MB, which is quite in tune with the actual file size. I would imagine the triangleList takes up roughly the same amount of memory (again allowing for slight overhead; it's a List<T> nonetheless), but no! (I'm using GC.GetTotalMemory(false) to evaluate memory usage.)
From logpoint 1 to logpoint 2 there is an increase of 526,660,800 bytes; this is quite accurately the size of the STL file that is loaded into the byte array.
Between logpoint 2 and logpoint 3 there is an increase of roughly the same amount, which is understandable, because I pass a subarray to the ReadTriangles method. The SubArray extension is code I found here on SO (could this be the devil in disguise?):
public static T[] SubArray<T>(this T[] data, int index, int length)
{
    T[] result = new T[length];
    Array.Copy(data, index, result, 0, length);
    return result;
}
Things get ridiculous at the next logpoint. Between logpoint 3 and logpoint 4 there is an increase in memory usage of roughly 4.73 times the size of the original STL file (as you can see, I make heavy use of SubArray while parsing each triangle).
When I let the program continue, there is no significant further increase in memory usage: good. But there is also no decrease at all: bad. I would expect the byte[] holding the file to release its memory, since it goes out of scope, as does the subarray I passed to ReadTriangles(byte[] ...), but somehow they don't. And I end up with an "overhead" of 5.7 times the size of my raw STL data.
Is this usual behaviour? Does the .NET runtime keep memory allocated (even if it has been paged out to disk) once it got hold of some juicy RAM, just like Photoshop does? How can I reduce the memory footprint of this combination of classes?
EDIT:
I called GC.Collect() after the object creation was done (so outside the object itself) and nothing happened. Only after setting the object reference to null did I get the memory back.
Upvotes: 3
Views: 609
Reputation: 5514
Memory overhead
Your cVector class adds a lot of memory overhead. On a 32-bit system, any reference object has a memory overhead of 12 bytes (although 4 of those are free to be used by fields if possible), if I recall correctly. Let's go with an overhead of 8 bytes. So in your case, with 10,000,000 triangles, each containing 4 vectors, that adds up to:
10,000,000 * 4 * 8 = 305 MB of overhead
If you're running on a 64-bit system it's twice that:
10,000,000 * 4 * 16 = 610 MB of overhead
On top of this, you also have the overhead of the four references each cSTLTriangle holds to those vectors, giving you:
10,000,000 * 4 * 4 = 152 MB (32-bit)
10,000,000 * 4 * 8 = 305 MB (64-bit)
As you can see this all builds up to quite a hefty bit of overhead.
So, in this case, I would suggest you make cVector a struct. As discussed in the comments, a struct can implement interfaces (as well as have properties and methods). Just be aware of the caveats that @Jcl mentioned.
You have the same issue with your cSTLTriangle class (about 76/152 MB of overhead for 32-bit and 64-bit, respectively), although at that size I'm not sure I would recommend going with a struct. Others here might have useful insights on that matter.
Additionally, due to padding and object layout, the overhead might actually be even larger, but I haven't taken that into account here.
List capacity
Using the List<T> class with that number of objects can cause some wasted memory. As @Matthew Watson mentions, when the list's internal array has no more room, it will be expanded; in fact, it will double its capacity every time that happens. In a test with your number of 10,533,050 entries, the capacity of the list ended up at 16,777,216 entries, giving an overhead of:
( 16777216 - 10533050 ) * 4 byte reference = 23 MB (32-bit)
( 16777216 - 10533050 ) * 8 byte reference = 47 MB (64-bit)
So since you know the number of triangles in advance, I would recommend just going with a simple array. Manually setting the Capacity of a list works too.
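A small demonstration of the doubling behaviour and the pre-sized alternative (using List&lt;int&gt; as a stand-in for the triangle type, with the entry count from the question):

```csharp
using System;
using System.Collections.Generic;

class CapacityDemo
{
    static void Main()
    {
        int numberOfTriangles = 10533050; // the count from the question

        // Unsized list: the backing array doubles each time it fills up,
        // ending at the next power-of-two-ish size well past the count.
        var grown = new List<int>();
        for (int i = 0; i < numberOfTriangles; i++)
            grown.Add(i);

        // Pre-sized list: the backing array is allocated exactly once.
        var presized = new List<int>(numberOfTriangles);

        Console.WriteLine(grown.Capacity);    // 16777216
        Console.WriteLine(presized.Capacity); // 10533050
    }
}
```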
Other issues
The other issues that have been discussed in the comments should not give you any memory overhead, but they sure will put a lot of unnecessary pressure on the GC. Especially the SubArray method: while very practical, it will create many millions of temporary arrays for the GC to handle. I suggest skipping it and indexing into the array manually, even if it's more work.
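A sketch of what manual indexing could look like: each field is read straight out of the big buffer at a computed offset, so the loop allocates no temporary byte arrays at all. This assumes a little-endian machine (the common case), which makes the question's HandleLSBFirst helper unnecessary here; the helper method name and layout constants are taken from the question, the rest is illustrative.

```csharp
using System;

public class DirectIndexingDemo
{
    // 12 floats (normal + 3 vertices) * 4 bytes + a 2-byte attribute count
    public const int BYTES_PER_TRIANGLE = 50;

    // Reads the 12 floats of one triangle directly from the buffer.
    // No SubArray calls, so no per-triangle garbage for the GC.
    public static float[] ReadTriangleFloats(byte[] buffer, int triangleIndex)
    {
        int baseOffset = 4 + triangleIndex * BYTES_PER_TRIANGLE; // skip the 4-byte triangle count
        var floats = new float[12];
        for (int i = 0; i < 12; i++)
            floats[i] = BitConverter.ToSingle(buffer, baseOffset + i * 4);
        return floats;
    }

    static void Main()
    {
        // Build a fake one-triangle buffer: count = 1, then 12 floats, then the attribute count
        byte[] buffer = new byte[4 + BYTES_PER_TRIANGLE];
        BitConverter.GetBytes(1u).CopyTo(buffer, 0);
        for (int i = 0; i < 12; i++)
            BitConverter.GetBytes((float)i).CopyTo(buffer, 4 + i * 4);

        float[] t = ReadTriangleFloats(buffer, 0);
        Console.WriteLine(t[0]);  // 0
        Console.WriteLine(t[11]); // 11
    }
}
```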
Another issue is reading the entire file at once. This will be both slower and use more memory than reading it piece by piece. Directly using a BinaryReader as others have suggested might not be possible due to the endianness issues you need to deal with. One slightly more involved option is to use memory-mapped files; that would let you access the file's data without having to care about whether it has been read yet, leaving those details to the OS.
(man I hope I got all these numbers right)
Upvotes: 3
Reputation: 109597
There are a couple of things you can try to decrease memory usage.
Firstly, if possible, you should rewrite your file-loading code so that it only loads the data it needs, rather than loading the whole file at once.
For example, you could read the header as a single block, and then read the data for each triangle as a single block (in a loop).
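A sketch of that block-wise loading, under the assumption that the per-record buffer can be reused (only the 80-byte header and one 50-byte record are in memory at a time, and the loop itself allocates nothing):

```csharp
using System;
using System.IO;

public class ChunkedReadSketch
{
    // Reads a binary STL piece by piece: header, then triangle count,
    // then one 50-byte record at a time into a reused buffer.
    public static uint ReadTriangles(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            byte[] header = new byte[80];
            stream.Read(header, 0, header.Length);

            byte[] countBytes = new byte[4];
            stream.Read(countBytes, 0, countBytes.Length);
            uint triangleCount = BitConverter.ToUInt32(countBytes, 0);

            byte[] record = new byte[50]; // reused for every triangle
            for (uint i = 0; i < triangleCount; i++)
            {
                stream.Read(record, 0, record.Length);
                // parse the record here, e.g. BitConverter.ToSingle(record, 0) for normal.x
            }
            return triangleCount;
        }
    }

    static void Main()
    {
        // Tiny fake binary STL: 80-byte header, count = 3, three zeroed records
        string path = Path.GetTempFileName();
        byte[] data = new byte[80 + 4 + 3 * 50];
        BitConverter.GetBytes(3u).CopyTo(data, 80);
        File.WriteAllBytes(path, data);

        Console.WriteLine(ChunkedReadSketch.ReadTriangles(path)); // 3
        File.Delete(path);
    }
}
```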
Secondly, it's possible that your large object heap is suffering from fragmentation, and the garbage collector doesn't move large objects, so it can't be defragmented. (This issue is fixed in .NET 4.5.1, but you have to explicitly enable large object heap compaction and instigate it explicitly.)
You may be able to mitigate this problem by pre-sizing your triangleList.
At the moment, you add each triangle to triangleList in turn, starting with a list of zero capacity. This means that every so often the list's capacity will be exceeded, causing it to be expanded.
When an item is added to a list that is already at capacity, the list:
- allocates a new internal array with double the current capacity, and
- copies all the existing items into the new array.
The problem here is twofold: the repeated copying costs time, and the abandoned internal arrays (which for a list this size live on the large object heap) add to the fragmentation described above.
Since you know in advance the maximum size of the triangle list you can solve this issue by setting the list's capacity before adding items to it:
triangleList.Capacity = numberOfTriangles;
Upvotes: 2
Reputation: 173
After logpoint 2, maybe you could try splitting the code out a bit, so that you have a byte[] header and a byte[] triangles. Once you're done splitting the original byte array, set it to null and then use System.GC.Collect() to force the garbage collector to run. This should save you a bit of memory.
public void ReadFromFile(string in_filePath)
{
    byte[] stlBytes;
    //Memory logpoint 1
    stlBytes = File.ReadAllBytes(in_filePath);
    //Memory logpoint 2
    byte[] header = stlBytes.SubArray(0, cConstants.BYTES_IN_HEADER);
    byte[] triangles = stlBytes.SubArray(cConstants.BYTES_IN_HEADER, stlBytes.Length - cConstants.BYTES_IN_HEADER);
    ReadHeader(header);
    ReadTriangles(triangles);
    stlBytes = null;
    System.GC.Collect();
    //Evaluate memory logpoints here
}
Upvotes: 0