nobody
nobody

Reputation: 2759

What is the most efficient collection class in C# for string search

string[] words = System.IO.File.ReadAllLines("word.txt");
var query = from word in words
            where word.Length > "abe".Length && word.StartsWith("abe")
            select word;
foreach (var w in query.AsParallel())
{
    Console.WriteLine(w);
}

Basically the word.txt contains 170000 English words. Is there a collection class in C# that is faster than array of string for the above query? There will be no insert or delete, just search if a string starts with "abe" or "abdi".

Each word in the file is unique.

EDIT 1 This search will be performed potentially millions of times in my application. Also I want to stick with LINQ for collection query because I might need to use aggregate function.

EDIT 2 The words from the file are sorted already, the file will not change

Upvotes: 6

Views: 940

Answers (3)

szogun1987
szogun1987

Reputation: 712

If you search much often than you change a file with words. You can sort words in file every time you change list. After this you can use bisectional search. So you will have to make up to 20 comparisons to find any word witch match with your key and some additional comparisons of neighborhood.

Upvotes: 0

Eugen
Eugen

Reputation: 2990

myself I'd create a Dictionary<char, List<string>>, where I'd group words by their first letter. This will reduce substantially the lookup of needed word.

Upvotes: 4

Alexei Levenkov
Alexei Levenkov

Reputation: 100620

If you need to do search once there is nothing better than linear search - array is perfectly fine for it.

If you need to perform repeated searches you can consider soring the array (n Log n) and search by any prefix will be fast (long n). Depending on type of search using dictionary of string lists indexed by prefix may be another good option.

Upvotes: 1

Related Questions