Sergey Rotbart
Sergey Rotbart

Reputation: 319

C# Dictionary internal array size

I'm not sure if this is the right question to ask here but please don't kill me :)

I have an argument with a friend about C#'s dictionary… She tells me that if I have lets say dictionary with 1 element. And the hash code for the key is 100000 then the internal array of the dictionary will be at the size of 100000!

Is it true? I tried to find answers on Google but for some reason I didn't find for that question.

Upvotes: 7

Views: 9083

Answers (7)

Yuriy Look
Yuriy Look

Reputation: 11

As an update, now Microsoft provides reference code for .NET implementation, which makes things easier than going through decompilation. Specifically for this question, please see

https://referencesource.microsoft.com/#mscorlib/system/collections/generic/dictionary.cs https://referencesource.microsoft.com/#mscorlib/system/collections/hashtable.cs

You can see that the default capacity is 3, as mentioned in the previous answers.

Upvotes: 0

tomfanning
tomfanning

Reputation: 9660

The default constructor of Dictionary, "has the default initial capacity", according to MSDN.

It also states:

If you can estimate the size of the collection, using a constructor that specifies the initial capacity eliminates the need to perform a number of resizing operations while adding elements to the Dictionary.

One such constructor simply takes an Int32, which initialises the internal storage as follows:

The initial number of elements that the Dictionary can contain.

What the "default initial capacity" of the dictionary actually is is an internal implementation detail of the class and as such not exposed in the documentation or public API.

Disassembling mscorlib with ilspy and examining the default constructor shows that it is implemented as follows:

public Dictionary() : this(0, null)
{
}

That chained constructor is implemented as follows:

public Dictionary(int capacity, IEqualityComparer<TKey> comparer)
{
    if (capacity < 0)
    {
        ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.capacity);
    }

    if (capacity > 0)
    {
        this.Initialize(capacity);
    }

    this.comparer = (comparer ?? EqualityComparer<TKey>.Default);
}

i.e. Initialize() is not called at all by the default constructor, either directly or indirectly.

Initialize() is the method that sets up the internal storage.

So actually, if you call the default constructor, the internal storage size is not even initialised until you first add an item. So it has essentially a zero size.

Initialize() is eventually called with a value of zero the first time you call .Add(), which sets things up.

private void Initialize(int capacity)
{
    int prime = HashHelpers.GetPrime(capacity);
    this.buckets = new int[prime];
    for (int i = 0; i < this.buckets.Length; i++)
    {
        this.buckets[i] = -1;
    }
    this.entries = new Dictionary<TKey, TValue>.Entry[prime];
    this.freeList = -1;
}

GetPrime(0) returns 3, so this.buckets is set to an array containing three integers.

The line that assigns a value to this.entries looks a little odd but I don't see where 100000 comes into this.

Short answer
I think your colleague is wrong.

Upvotes: 3

Matthew
Matthew

Reputation: 25753

The hash code (ie GetHashCode) is used for placing items in the buckets that the dictionary uses.

The actual capacity used is based upon the number of elements in the dictionary.

The (probably not accurate) pseudo code for what GetHashCode is used for is like so.

List<List<KeyValuePair<T,J>>> buckets; // let's assume this get's allocated somewhere (the dictionary allocates this internally)
...
public J GetValueFromDictionary(T key)
{
    int bucketIndex = key.GetHashCode() % buckets.Length;
    return buckets[bucketIndex].Find(x => x.Key == key).Single().Value;
}

Upvotes: 2

Mike Perrenoud
Mike Perrenoud

Reputation: 67898

That's hardly the case because the storage of a dictionary is much more complex than that. Further, the value of the hash code, which may be the key, does not in any way define the size of the dictionary (I have no idea how that could even be fabricated).

Now, let's break down the storage of a dictionary.

As long as an object is used as a key in the Dictionary, it must not change in any way that affects its hash value. Every key in a Dictionary must be unique according to the dictionary's equality comparer. A key cannot be a null reference (Nothing in Visual Basic), but a value can be, if the value type TValue is a reference type.

Dictionary requires an equality implementation to determine whether keys are equal. You can specify an implementation of the IEqualityComparer generic interface by using a constructor that accepts a comparer parameter; if you do not specify an implementation, the default generic equality comparer EqualityComparer.Default is used. If type TKey implements the System.IEquatable generic interface, the default equality comparer uses that implementation.

Though it is likely that the hash code will be used because EqualityComparer.Default is defined as such:

The Default property checks whether type T implements the System.IEquatable generic interface and if so returns an EqualityComparer that uses that implementation. Otherwise it returns an EqualityComparer that uses the overrides of Object.Equals and Object.GetHashCode provided by T.

It is by no means guaranteed that is going to be how the key is generated. So, I hope this helps you in your argument.

Bottom line, there is no way that the hash code determines the internal size of the dictionary, a dictionary is mutable and grows as items are added as stated by Microsoft:

The capacity of a Dictionary is the number of elements the Dictionary can hold. As elements are added to a Dictionary, the capacity is automatically increased as required by reallocating the internal array.

Your friend needs to do her research before arguing. Whew!

Upvotes: 0

MoonKnight
MoonKnight

Reputation: 23833

No this is not true. If I have

Dictionary<int, object[]> dict = new Dictionary<int, object[]>() 
{
    {10000, new object[] { 1, 2, 3, 4 }}
};

This dictionary will contain one object array with the index 10000, not 9999 empty object array slots followed by the object we entered above. The answer is no, you friend is wrong.

I hope this helps.

Upvotes: -1

Deepansh Gupta
Deepansh Gupta

Reputation: 593

Just use reflector to decompile the code and verify it yourself.

Upvotes: 0

AgentFire
AgentFire

Reputation: 9780

No, it is not. The sources of Dictionary<,> class prove that.

Upvotes: 0

Related Questions