Vinay Sharma
Vinay Sharma

Reputation: 1

Occurrence of a word from a set of words in a string

Could someone suggest the algorithm to find the occurrence of any word from the set of K words in a string?
For Example :
Set of Words: {abc,xyz}
String : abcdefghiabcjklabxyz
Output : {0,9,17} //Starting positions of words in the string

Something better than running KMP K times !!!

Upvotes: 0

Views: 96

Answers (2)

MBo
MBo

Reputation: 80287

Aho-Corasick algorithm is intended to search for any word from given dictionary in the text.

There are some other algorithms for this task - Commentz-Walter , Rabin–Karp (but Aho-Corasic one has better complexity for the worst case)

Upvotes: 1

Malcolm McLean
Malcolm McLean

Reputation: 6404

If you want to do this on an industrial scale, use a suffix tree. You store every suffix in the string, and you can then search for sub-strings in essentially O constant time, because all the sub strings are the same string with different suffixes.

However the strings need to be very long before a suffix tree justifies the complexity (they are used in reality for scanning DNA sequence data amongst other things).

Upvotes: 0

Related Questions