Jonathan Wood
Jonathan Wood

Reputation: 67345

Best Collection for Fast String Lookup

I need a list of strings and a way to quickly determine if a string is contained within that list.

To enhance lookup speed, I considered SortedList and Dictionary; however, both work with KeyValuePairs when all I need is a single string.

I know I could use a KeyValuePair and simply ignore the Value portion. But I do prefer to be efficient and am just wondering if there is a collection better suited to my requirements.

Upvotes: 29

Views: 22105

Answers (7)

Artur Krajewski
Artur Krajewski

Reputation: 589

I know the question is old as hell, but I just had to solve the same problem, only for a very small set of strings(between 2 and 4).

In my case, I actually used manual lookup over an array of strings which turned up to be much faster than HashSet<string>(I benchmarked it).

for (int i = 0; i < this.propertiesToIgnore.Length; i++)
{
    if (this.propertiesToIgnore[i].Equals(propertyName))
    {
        return true;
    }
}

Note, that it is better than hash set for only for tiny arrays!

EDIT: works only with a manual for loop, do not use LINQ, details in comments

Upvotes: 1

BrokenGlass
BrokenGlass

Reputation: 161012

If you just want to know if a string is in the set use HashSet<string>

Upvotes: 10

user3810913
user3810913

Reputation:

I know this answer is a bit late to this party, but I was running into an issue where our systems were running slow. After profiling we found out there was a LOT of string lookups happening with the way we had our data structures structured.

So we did some research, came across these benchmarks, did our own tests, and have switched over to using SortedList now.

if (sortedlist.ContainsKey(thekey))
{   
//found it.
}

Even though a Dictionary proved to be faster, it was less code we had to refactor, and the performance increase was good enough for us.

Anyway, wanted to share the website in case other people are running into similar issues. They do comparisons between data structures where the string you're looking for is a "key" (like HashTable, Dictionary, etc) or in a "value" (List, Array, or in a Dictionary, etc) which is where ours are stored.

Upvotes: 1

Henk Holterman
Henk Holterman

Reputation: 273784

This sounds like a job for

 var keys = new HashSet<string>();

Per MSDN: The Contains function has O(1) complexity.

But you should be aware that it does not give an error for duplicates when adding.

Upvotes: 6

James
James

Reputation: 9288

If you feel like rolling your own data structure, use a Trie. http://en.wikipedia.org/wiki/Trie

worst-case is if the string is present: O(length of string)

Upvotes: 1

Albin Sunnanbo
Albin Sunnanbo

Reputation: 47068

HashSet<string> is like a Dictionary, but with only keys.

Upvotes: 3

Jon Skeet
Jon Skeet

Reputation: 1503519

If you're on .NET 3.5 or higher, use HashSet<String>.

Failing that, a Dictionary<string, byte> (or whatever type you want for the TValue type parameter) would be faster than a SortedList if you have a lot of entries - the latter will use a binary search, so it'll be O(log n) lookup, instead of O(1).

Upvotes: 38

Related Questions