Reputation: 659
I have a list of strings. I want to find all of the strings that start or end with another string. At its simplest an example is;
List<string> allWords = new List<string>();
for(int index = 0; index < 1000000; index++)
allWords.Add(index.ToString());
List<string> result = allWords.FindAll(x => x.StartsWith("10") || x.EndsWith("10"));
This algorithm scans the list from beginning to end. I need to perform this operation very quickly and O(n) is too slow.
What data structures (if any) are available to me to solve this algorithm faster that O(n)?
Upvotes: 0
Views: 2087
Reputation: 292455
If you have an unsorted List<string>
, there is no way to do it in less than O(n)
. However, you could use a different data structure. A trie (also called prefix tree) is particularly well suited for your need, as it has a O(m)
search complexity (where m
is the length of the searched prefix)
I have a C# implementation here : Trie.cs (actually, it's a trie-based dictionary, which associates a value with each key, but for your use-case you can just ignore the value; or if you prefer you can adapt the implementation to your needs).
Upvotes: 4
Reputation: 116785
To find strings starting with a given substring, sort the list, do a binary search to find the closest match, then scan adjacent strings to find others that also match the beginning. That's log(n)
To find strings ending with a given substring, create a list of reversed strings, and sort that list. Then to find a string that ends in a given pattern, reverse the pattern and look for reversed strings that start with the reversed pattern, as in step 1.
Upvotes: 0