Ehssan

Reputation: 759

HashSet<T> performance (compared to ObservableCollection<T>)?

I am currently working on a project where I have to manage large sets of unique elements. Each element has ~20 properties, including a public DateTime property.

The DateTime property is not unique, so I cannot use it as the key of a generic dictionary to store my data.

Currently I put those elements into an ObservableCollection, but the performance of removing elements from the collection is incredibly slow: I end up waiting ~20 seconds to remove ~7,000 elements from a collection of ~25,000 elements.

(The search operation seems to be quite efficient: it takes only ~30 ms to find 80 randomly selected elements in an unsorted collection of 300,000 elements.)

Each element implements the GetHashCode() method by simply returning its DateTime property's GetHashCode().

I thought that using a HashSet rather than an ObservableCollection would improve my performance quite a bit, but it doesn't seem to have any effect at all...

And using a generic dictionary is even worse...

Isn't a HashSet more powerful than an ObservableCollection when the elements have "good" hash functions (i.e. very few elements share the same hash code)?
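For reference, a stripped-down sketch of the setup described above (the type and property names are placeholders, not the actual ones):

```csharp
using System;

// Hypothetical element type; the real class has ~20 properties.
class Element
{
    public DateTime Timestamp { get; set; }

    // hash code delegates to the non-unique DateTime, as described above
    public override int GetHashCode()
    {
        return Timestamp.GetHashCode();
    }
}
```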

Upvotes: 3

Views: 2721

Answers (3)

Oliver

Reputation: 45109

Well, Marcel's answer is correct, but if performance really matters, you can slightly improve his Equals method:

class MyObject
{
    public Guid Id { get; set; }

    public override bool Equals(object other)
    {
        MyObject myObj = other as MyObject;

        if (myObj != null)
        {
            // use the 'Id' property as identifier
            return myObj.Id == this.Id;
        }

        // is not a 'MyObject' based object
        return base.Equals(other);
    }
}

With this approach you avoid performing the costly type check twice: the cast is done only once, followed by a fast null check. For more information, you could take a look at this article from Eric.

Upvotes: 2

user687474

Reputation:

You can optimize the performance of the ObservableCollection by reducing change notifications. I've written a custom collection class, ItemCollection, with an update mechanism (BeginUpdate/EndUpdate):

  ItemCollection<Customer> customers = new ItemCollection<Customer>();
  customers.BeginUpdate();
  customers.Add( new Customer( "Joe", "Smith" ) );
  customers.Add( new Customer( "Mary", "Jones" ) );
  customers.Add( new Customer( "Lisa", "Black" ) );
  customers.Add( new Customer( "Peter", "Brown" ) );
  customers.EndUpdate();

Article with source code: Presentation Patterns for XAML based Applications.
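A minimal sketch of how such a collection could work, assuming (like the article's ItemCollection, but not necessarily identical to it) that notifications are suppressed between BeginUpdate and EndUpdate and a single Reset event is raised at the end:

```csharp
using System.Collections.ObjectModel;
using System.Collections.Specialized;

// Sketch only: batches many Add/Remove calls into one Reset notification,
// so bound UI elements refresh once instead of once per change.
public class ItemCollection<T> : ObservableCollection<T>
{
    private bool suppressNotifications;

    public void BeginUpdate()
    {
        suppressNotifications = true;
    }

    public void EndUpdate()
    {
        suppressNotifications = false;
        // one Reset event instead of one event per Add/Remove
        OnCollectionChanged(
            new NotifyCollectionChangedEventArgs(NotifyCollectionChangedAction.Reset));
    }

    protected override void OnCollectionChanged(NotifyCollectionChangedEventArgs e)
    {
        if (!suppressNotifications)
        {
            base.OnCollectionChanged(e);
        }
    }
}
```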

Upvotes: 2

user432219

Reputation:

You have to override the Equals method of your objects, because the HashSet uses an internal IEqualityComparer instance that usually first checks for null and then compares a non-null item with another item by using the overridden Equals method:

class MyObject
{
    public Guid Id { get; set; }

    public override bool Equals(object other)
    {
        if (other is MyObject)
        {
            // use the 'Id' property as identifier
            MyObject myObj = (MyObject)other;
            return myObj.Id == this.Id;
        }

        // is not a 'MyObject' based object
        return base.Equals(other);
    }
}

You can also use strings or any other comparable type as the identifier for your objects.

EDIT:

So you can use the HashSet instead of the ObservableCollection. The latter is normally slower because on each collection change (add, remove, clear, insert, etc.) the PropertyChanged and CollectionChanged events are raised.
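Note that for the HashSet to find the right bucket, GetHashCode must be overridden consistently with Equals (equal objects must produce equal hash codes). A sketch putting both together with the MyObject example above:

```csharp
using System;
using System.Collections.Generic;

class MyObject
{
    public Guid Id { get; set; }

    public override bool Equals(object other)
    {
        MyObject myObj = other as MyObject;
        return myObj != null ? myObj.Id == this.Id : base.Equals(other);
    }

    // must match Equals: objects with the same Id get the same hash code
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

class Program
{
    static void Main()
    {
        var set = new HashSet<MyObject>();
        var id = Guid.NewGuid();
        set.Add(new MyObject { Id = id });

        // Remove works even with a *different* instance carrying the same Id,
        // and uses the hash code instead of a linear scan of the collection
        bool removed = set.Remove(new MyObject { Id = id });
        Console.WriteLine(removed); // True
    }
}
```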

Upvotes: 3
