Benny Ae
Benny Ae

Reputation: 2016

ConcurrentDictionary adding same keys more than once

I want to use ConcurrentDictionary to check if this data key has been added before, but it looks like I can still add keys which added before.

code:

    public class pKeys
    {
        public pKeys()
        { }
        public pKeys(long sID, long pID)
        {
            this.seID = sID;
            this.pgID = pID;

        }
        public long seID;
        public long pgID;
    }

    public static ConcurrentDictionary<pKeys, bool> existenceDic 
= new ConcurrentDictionary<pKeys, bool>();

test code:

    pKeys temKey = new pKeys(111, 222);
    bool res = existenceDic.TryAdd(temKey, true);
    Console.WriteLine(res);

    temKey = new pKeys(111, 222);
    res = existenceDic.TryAdd(temKey, true);
    Console.WriteLine(res);

result:

true
true

Upvotes: 2

Views: 5108

Answers (1)

Peter Duniho
Peter Duniho

Reputation: 70652

You can add two different instances containing the same values, because you haven't overridden GetHashCode() and Equals(). This causes the default equality comparison to be used, which for reference types simply compares the references themselves. Two different instances are always considered as different values in this case.

One option is to make your type a struct instead of class. This uses a default comparison that will take into account the field values.

Alternatively, you can go ahead and override GetHashCode() and Equals(). For example:

public class pKeys
{
    public pKeys()
    { }
    public pKeys(long sID, long pID)
    {
        this.seID = sID;
        this.pgID = pID;

    }
    public readonly long seID;
    public readonly long pgID;

    public override int GetHashCode()
    {
        return seID.GetHashCode() * 37 + pgID.GetHashCode();
    }

    public override bool Equals(object other)
    {
        pKeys otherKeys = other as pKeys;

        return otherKeys != null &&
            this.seID == otherKeys.seID &&
            this.pgID == otherKeys.pgID;
    }
}

Notes:

  • The hash code is calculated based on the hash codes of the individual values. One is multiplied by 37, which is simply a convenient prime number; some people prefer to use a much larger prime number for better "mixing". For most scenarios, the above will work fine IMHO.
  • Note that your proposed solution, converting the values to strings, concatenating them, and returning the hash code of that has several negative aspects:
    • You have to create three string instances just to generate the hash code! The memory overhead alone is bad enough, but of course there is the cost of formatting the two integers as well.
    • Generating a hash code from a string is more expensive computationally than from an integer value
    • You have a much higher risk of a collision, as it's easier for disparate values to result in the same string (e.g. (11, 2222) and (111, 222))
  • I added readonly to your fields. This would be critical if you decide to make the type a struct (i.e. even if you don't override the methods). But even for a class, mutable types that are equatable are a huge problem, because if they change after they are added to a hash-based collection, the collection is effectively corrupted. Using readonly here ensures that the type is immutable. (Also, IMHO public fields should be avoided, but if one must have them, they should definitely be readonly even if you don't override the equality methods).
  • Some people prefer to check for exact type equality in the Equals() method. In fact, this is often a good idea…it simplifies the scenarios where objects are compared and makes the code more maintainable. But for the sake of example, assignability (i.e. as) is easier to read, and is valid in many scenarios anyway.

See General advice and guidelines on how to properly override object.GetHashCode() for additional guidance.

Upvotes: 3

Related Questions