Reputation: 1114
For simplicity, let's say that I have two sets of words, sorted into alphabetical order. One set starts at "aardvark" and ends at "melon", and the other starts at "melon" and ends at "zebra". The word "melon" appears in both sets.
If I were to take an input word, say "banana", what would be a good (and efficient) way of determining which set of words it should belong to? Note: this isn't a question about whether the word "banana" already exists in one set, but rather a question about how to determine which set the word should exist in.
If there is an algorithm that someone knows, great. If they can provide some version in Java, even better!
Edit: Should also point out, whilst my example only has 2 sets, I want the algorithm to work with n sets.
Upvotes: 2
Views: 2692
Reputation: 12610
final int n = 99; // whatever
final SortedSet<String>[] allMySets = new SortedSet[ n ];
// put your sets into allMySets, no particular order required.
final String searchWord = "banana";
int i;
for ( i = 0; i < allMySets.length; i++ ) {
final SortedSet< String > ss = allMySets[i];
if ( searchWord.compareTo( ss.first() ) >= 0 && searchWord.compareTo( ss.last() ) <= 0 ) {
System.out.println("Word " + searchWord + " belongs to set #" + i);
break;
}
}
if ( i == allMySets.length ) {
System.out.println("No matching set found.");
// Maybe handle border case here...
}
Upvotes: 0
Reputation: 81074
Let's say you have n
sets. Construct a list of the "partition" words in sorted order.
Then the set it belongs to is simply:
List<String> partitions = Arrays.asList("melon", "strawberry");
int setIndex = -(Collections.binarySearch(partitions, "banana")) - 1;
This works because Collections.binarySearch
returns the insertion position (-1) if it cannot find the key in the list. If it might collide with one of the partition words then you should first check whether the result is negative.
I edited to remove the requirement for the "book-end" values ("aardvark" and "zebra") as they actually only complicated things.
Upvotes: 2
Reputation: 500407
For two sets:
If word
is your word (e.g. "banana"
):
int cmp = word.compareTo("melon");
if (cmp < 0) {
// it belongs to the first set
} else if (cmp > 0) {
// it belongs to the second set
} else {
// the word is "melon"
}
For n
sets:
Place the dividing words into an ArrayList<String>
(call it dividers
) in alphabetical order:
ArrayList<String> dividers = new ArrayList<String>();
//... populate `dividers` ...
Collections.sort(dividers);
Now you can use Collections.binarySearch()
to figure out which set the word belongs to:
int pos = Collections.binarySearch(dividers, word);
if (pos >= 0) {
// the word is the divider between sets `pos` and `pos+1`
} else {
int num = -(pos + 1);
// the word belong to set number `num`
}
(Here, the sets are numbered from zero.)
Upvotes: 2
Reputation: 30032
If you use a binary heap to store the lists then determining where to insert a word will take O(log n)
Upvotes: 0
Reputation: 56089
String mid = firstList.get(firstList.size()-1);
assert(mid.equals(secondList.get(0)));
if(newString.compareTo(mid) < 0) // belongs in first
else // belongs in second.
Obviously, you may need to adapt some of the method calls depending on how you're holding them.
Upvotes: 0
Reputation: 1
Just check the first letter and see if it's between (first letter of set 1) and (first letter of last element of set 1). If it's equal to both first letters, move on to the second letters. If it doesn't fit in that set move to the next set. This is BigO(n*m), where n is the number of sets and m is the number of letters in your input word. Not too bad IMO.
Upvotes: 0