Alexander Smirnov
Alexander Smirnov

Reputation: 1655

List<T> vs HashSet<T> - dynamic collection choice is efficient or not?

var usedIds = list.Count > 20 ? new HashSet<int>() as ICollection<int> : new List<int>();

Assuming that List is more performant with 20 or less items and HashSet is more performant with greater items amount (from this post), is it efficient approach to use different collection types dynamicaly based on the predictable items count?

All of the actions for each of the collection types will be the same.

PS: Also i have found HybridCollection Class which seems to do the same thing automaticaly, but i've never used it so i have no info on its performance either.

EDIT: My collection is mostly used as the buffer with many inserts and gets.

Upvotes: 2

Views: 1657

Answers (6)

quadroid
quadroid

Reputation: 8940

Hashtables like Hashset<T> and Dictionary<K,T> are faster at searching and inserting items in any order.

Arrays T[] are best used if you always have a fixed size and a lot of indexing operations. Adding items to a array is slower than adding into a list due to the covariance of arrays in c#.

List<T> are best used for dynamic sized collections whith indexing operations.

I don't think it is a good idea to write something like the hybrid collection better use a collection dependent on your requirements. If you have a buffer with a lof of index based operations i would not suggest a Hashtable, as somebody already quoted a Hashtable by design uses more memory

Upvotes: 1

nawfal
nawfal

Reputation: 73173

is it efficient approach to use different collection types dynamicaly based on the predictable items count?

It can be depending on what you mean by "efficiency" (MS offers HybridDictionary class for that, though unfortunately it is non generic). But irrespective of that its mostly a bad choice. I will explain both.

From an efficiency standpoint:

  1. Addition will be always faster in a List<T>, since a HashSet<T> will have to precompute hash code and store it. Even though removal and lookup will be faster with a HashSet<T> as size grows up, addition to the end is where List<T> wins. You will have to decide which is more important to you.

  2. HashSet<T> will come up with a memory overhead compared to List<T>. See this for some illustration.

But however, from a usability standpoint it need not make sense. A HashSet<T> is a set, unlike a bag which List<T> is. They are very different, and their uses are very different. For:

  1. HashSet<T> cannot have duplicates.

  2. HashSet<T> will not care about any order.

So when you return a hybrid ICollection<T>, your requirement goes like this: "It doesn't matter whether duplicates can be added or not. Sometimes let it be added, sometimes not. Of course iteration order is not important anyway" - very rarely useful.

Good q, and +1.

Upvotes: 0

Gerald
Gerald

Reputation: 23489

In theory, it could be, depending on how many and what type of operations you are performing on the collections. In practice, it would be a pretty rare case where such micro-optimization would justify the added complexity.

Also consider what type of data you are working with. If you are using int as the collection item as the first line of your question suggests, then the threshold is going to be quite a bit less than 20 where List is no longer faster than HashSet for many operations.

In any case, if you are going to do that, I would create a new collection class to handle it, something along the lines of the HybridDictionary, and expose it to your user code with some generic interface like IDictionary.

And make sure you profile it to be sure that your use case actually benefits from it.

There may even be a better option than either of those collections, depending on what exactly it is you are doing. i.e. if you are doing a lot of "before or after" inserts and traversals, then LinkedList might work better for you.

Upvotes: 3

Dejan
Dejan

Reputation: 3084

HashSet is better, because it will probably use less space, and you will have faster access to elements.

Upvotes: -3

Servy
Servy

Reputation: 203820

If you collection is very small then the performance is virtually always going to be a non-issue. If you know that n is always less than 20, O(n) is, by definition, O(1). Everything is fast for small n.

Use the data structure that most appropriate represents how you are conceptually treating the data, the type of operations that you need to perform, and the type of operations that should be most efficient.

Upvotes: 1

alexmac
alexmac

Reputation: 19607

HashSet is for faster access, but List is for insert. If you don't plan adding new items, use HashSet, otherwise List.

Upvotes: 1

Related Questions