Reputation: 1715
Recently, I met two use cases
efficiently query whether a given string word
is a substring of some elements of a collection of strings.
efficiently query whether some elements of a collection of strings are substring of a given string word
.
class StringSet {
void add(String s);
void contains(String s);
void remove(String s);
List<String> getAllSubstringsOfGivenWord(String s);
List<String> getAllStringsContainsGivenWord(String s);
}
For example, the collection of String is {"abc", "dab", "bcd"}
, then getAllStringsContainsGivenWord("ab") --> ["abc", "dab"]
, getAllSubstringsOfGivenWord("dabc") --> ["abc", "dab"]
Which kind of data structure suits for these kinds of use cases?
Upvotes: 2
Views: 160
Reputation: 4016
What level of efficiency are you looking for? How complex can the add
and remove
operations be?
Are there any constraints or limits to the problem space? eg: the number or length of the words, the max length of the substring to look for, letter-casing sensitivity, alphabets to consider.
I'd suggest taking a look into Trie, Suffix Tree, and Generalized Suffix Tree.
Upvotes: 1