Reputation: 438
What are the memory limitations of a HashSet<string> in C#?
I've read that .NET has a memory limit of 2 GB per object. Is this information still accurate? Does it apply to HashSets?
I'm currently working on an application that uses a large HashSet, and I've noticed that once I build the DLLs for a 64-bit environment, I only get an OutOfMemoryException when my laptop with 8 GB of RAM runs out of memory.
If I had 16 GB of RAM, would the object keep growing until it hit the hardware limit?
Upvotes: 5
Views: 6665
Reputation: 109822
There is a 2GB limit per object, but remember that a reference type only uses the pointer size (8 bytes for x64) when it's a field in a class.
Array memory sizes are computed as follows (ignoring fixed overhead):
For arrays of struct types: number of elements * size of the struct.
For arrays of reference types: number of elements * pointer size (8 bytes on x64).
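To make the two cases above concrete, here is a minimal sketch of the arithmetic. The helper name PayloadBytes and the element count are made up for illustration, and the result ignores the array header and the objects that the references point to.

```csharp
using System;

static class ArraySizeEstimate
{
    // Approximate payload size of an array, ignoring the fixed header overhead.
    // (Hypothetical helper, not part of the framework.)
    public static long PayloadBytes(long elementCount, int bytesPerElement)
        => elementCount * bytesPerElement;

    static void Main()
    {
        const long count = 100_000_000;

        // Struct elements are stored inline, so the full struct size counts.
        long guidArray = PayloadBytes(count, 16); // Guid is a 16-byte struct

        // Reference-type elements only store a pointer in the array itself.
        // IntPtr.Size is 8 in a 64-bit process and 4 in a 32-bit one.
        long stringArray = PayloadBytes(count, IntPtr.Size);

        Console.WriteLine($"Guid[]   payload: {guidArray} bytes");
        Console.WriteLine($"string[] payload: {stringArray} bytes");
    }
}
```

The strings themselves live elsewhere on the heap, so they do not count towards the size of the string[] object.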
So a HashSet could reference objects totalling a lot more than the 2GB limit. It's just that if you add up the size taken by each field in the class - 64 bits for reference types, and the full size for struct types - it must be less than 2GB.
You could have a class that contained 16x1GB arrays of bytes, for instance.
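As a rough sketch of that idea, the hypothetical ChunkedBuffer class below holds several separate byte arrays. The class name and sizes are assumptions for illustration; each chunk is its own object, so only the individual chunks are subject to the per-object limit.

```csharp
using System;

// Hypothetical holder: the object itself is tiny (one reference field),
// and each byte[] chunk is a separate object on the heap.
sealed class ChunkedBuffer
{
    private readonly byte[][] _chunks;

    public ChunkedBuffer(int chunkCount, int bytesPerChunk)
    {
        _chunks = new byte[chunkCount][];
        for (int i = 0; i < chunkCount; i++)
            _chunks[i] = new byte[bytesPerChunk];
    }

    // Sum of the lengths of all chunks, as a long so it can exceed 2 GB.
    public long TotalBytes
    {
        get
        {
            long total = 0;
            foreach (byte[] chunk in _chunks)
                total += chunk.LongLength;
            return total;
        }
    }
}

static class Program
{
    static void Main()
    {
        // Kept small (16 x 1 MB) so it runs anywhere. On x64 with enough RAM,
        // passing 1 << 30 instead gives the 16 x 1 GB case from the text.
        var buffer = new ChunkedBuffer(16, 1 << 20);
        Console.WriteLine(buffer.TotalBytes); // prints 16777216
    }
}
```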
Also note that on 64-bit platforms it's possible to configure an application to allow arrays larger than 2GB in size (the gcAllowVeryLargeObjects setting, available since .NET 4.5), although the maximum number of elements in a single dimension of an array still cannot exceed roughly 2G (2*1024*1024*1024).
I suspect that the objects that you are storing in the HashSet are reference types, so it's only using 64 bits for each one in the internal HashSet array, while the full size of each of your objects is much larger than 64 bits - which gives a total size in excess of 2GB.
Looking at the referencesource for HashSet shows that the following arrays are used:
private int[] m_buckets;
private Slot[] m_slots;
Where Slot is defined like so:
internal struct Slot {
internal int hashCode; // Lower 31 bits of hash code, -1 if unused
internal T value;
internal int next; // Index of next entry, -1 if last
}
It looks like each Slot struct occupies 24 bytes on x64 when T is a reference type (two 4-byte ints and an 8-byte reference, plus padding, assuming the fields are laid out in declaration order). That means HashSet will throw an OutOfMemoryException when the number of slots exceeds 2GB/24, which is roughly 89M elements.
(If T is a large struct, you'll run out of memory a lot sooner, depending on its size.)
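The estimate can be checked with a small sketch. The Slot<T> struct below is a local copy made for illustration (the real one is private to HashSet), MaxElements is a made-up helper, and 2GB is taken as 2*1024*1024*1024 bytes. Unsafe.SizeOf is assumed to be available, which is the case on .NET 5 and later. The runtime is free to reorder or pad fields, so the measured size may differ from the 24-byte estimate.

```csharp
using System;
using System.Runtime.CompilerServices;

// Local copy of the field layout, for measurement only.
struct Slot<T>
{
    internal int hashCode;
    internal T value;
    internal int next;
}

static class SlotLimitEstimate
{
    // Largest element count whose backing array stays within byteLimit.
    public static long MaxElements(long byteLimit, int bytesPerSlot)
        => byteLimit / bytesPerSlot;

    static void Main()
    {
        const long twoGiB = 2L * 1024 * 1024 * 1024;

        // 24 bytes per slot is the estimate from the text, not a measured value.
        Console.WriteLine(MaxElements(twoGiB, 24)); // prints 89478485

        // Ask the runtime what it actually uses for a reference-type T.
        int measured = Unsafe.SizeOf<Slot<string>>();
        Console.WriteLine($"Measured slot size: {measured} bytes");
        Console.WriteLine(MaxElements(twoGiB, measured));
    }
}
```

Because the set grows its internal arrays in steps rather than one slot at a time, the failure can occur on a resize before the element count itself reaches this figure.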
Upvotes: 6