Reputation: 1
Could someone suggest the algorithm to find the occurrence of any word from the set of K words in a string?
For Example :
Set of Words: {abc,xyz}
String : abcdefghiabcjklabxyz
Output : {0,9,17} //Starting positions of words in the string
Something better than running KMP K times !!!
Upvotes: 0
Views: 96
Reputation: 80287
Aho-Corasick algorithm is intended to search for any word from given dictionary in the text.
There are some other algorithms for this task - Commentz-Walter , Rabin–Karp (but Aho-Corasic one has better complexity for the worst case)
Upvotes: 1
Reputation: 6404
If you want to do this on an industrial scale, use a suffix tree. You store every suffix in the string, and you can then search for sub-strings in essentially O constant time, because all the sub strings are the same string with different suffixes.
However the strings need to be very long before a suffix tree justifies the complexity (they are used in reality for scanning DNA sequence data amongst other things).
Upvotes: 0