Reputation: 613
Let's say I have a book title and I search for it in a database. The database produces matches, some of which are full matches and some of which are partial matches.
A full match is when every word in the search result is represented by a word in the search terms (i.e., there does not have to be a complete overlap on both sides).
I am only concerned with finding the full matches.
So if I type a search for "Ernest Hemingway - The Old Man and the Sea"
and the results return the following:
Charles Nordhoff - Men Against The Sea
Rodman Philbrick - The Young Man and the Sea
Ernest Hemingway - The Old Man and the Sea
Ernest Hemingway - The Sun Also Rises
Ernest Hemingway - A Farewell to Arms
Ernest Hemingway - For Whom the Bell Tolls
Ernest Hemingway - A Moveable Feast
Ernest Hemingway - True at First Light
Men Against The Sea
The Old Man and the Sea
The Old Man and the Sea Dog
There are TWO full matches in this list (according to the definition above):
Ernest Hemingway - The Old Man and the Sea
The Old Man and the Sea
To do this in Java, assume I have two variables:
String searchTerms;
List<String> searchResults;
searchTerms in the example above represents what I typed in: Ernest Hemingway - The Old Man and the Sea
searchResults represents the list of strings I got back from the database above.
for (String result : searchResults) {
    // How to check for a full match?
    // (each word in `result` is found in `searchTerms`)
}
My question is: in this for-loop, how do I check whether every word in the result String has a corresponding word in the searchTerms String?
Upvotes: 1
Views: 647
Reputation: 1167
To find the full match as you have defined it, you want to test that a set of tokens contains a particular subset. You can do this easily using a Set which you get for free in the collections libraries. One way to do this would be (the expense of regexes aside):
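import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// searchString is the raw text that was typed in; searchResults is the list from the database
Set<String> searchTerms = new HashSet<String>();
Set<String> resultTokens = new HashSet<String>();
searchTerms.addAll( Arrays.asList( searchString.split( "\\s+" ) ) );
for ( String result : searchResults )
{
    resultTokens.clear();
    resultTokens.addAll( Arrays.asList( result.split( "\\s+" ) ) );
    // Full match: every token of the result also appears in the search terms
    if ( searchTerms.containsAll( resultTokens ) )
    {
        // Perform match code
    }
}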
Alternatively, if you wanted to be stricter about it, you could test for set equality using resultTokens.equals( searchTerms ). In your example, this would narrow the result set to just "Ernest Hemingway - The Old Man and the Sea".
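For reference, here is a minimal, self-contained sketch of the set-based check run against the list from the question. The class name FullMatchDemo and the helpers tokenize and fullMatches are made up for illustration; the sketch assumes words are separated by whitespace and compared case-sensitively, so the "-" separator counts as an ordinary token. It asks whether the search-term set contains all of the result's tokens, which is the direction the question's definition calls for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; names are illustrative, not from the original posts.
class FullMatchDemo {

    // Splits text on runs of whitespace and collects the distinct tokens.
    static Set<String> tokenize(String text) {
        return new HashSet<>(Arrays.asList(text.trim().split("\\s+")));
    }

    // Returns the results whose every token also appears in searchString.
    static List<String> fullMatches(String searchString, List<String> searchResults) {
        Set<String> searchTerms = tokenize(searchString);
        List<String> matches = new ArrayList<>();
        for (String result : searchResults) {
            if (searchTerms.containsAll(tokenize(result))) {
                matches.add(result);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String searchString = "Ernest Hemingway - The Old Man and the Sea";
        List<String> searchResults = Arrays.asList(
                "Charles Nordhoff - Men Against The Sea",
                "Rodman Philbrick - The Young Man and the Sea",
                "Ernest Hemingway - The Old Man and the Sea",
                "Ernest Hemingway - The Sun Also Rises",
                "Men Against The Sea",
                "The Old Man and the Sea",
                "The Old Man and the Sea Dog");
        // Prints: [Ernest Hemingway - The Old Man and the Sea, The Old Man and the Sea]
        System.out.println(fullMatches(searchString, searchResults));
    }
}
```

If case should not matter, one option is to lower-case both strings before tokenizing; that is a design choice the question leaves open.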
Upvotes: 3
Reputation: 8125
Assuming your database result is accurate, split result into tokens (words) using String.split(String regex) and check whether each token is found in searchTerms. A token is missing when searchTerms.indexOf(String word) == -1.
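for (String result : searchResults) {
    boolean fullMatch = true;
    for (String word : result.split("\\s+")) {
        // indexOf returns -1 when word does not occur anywhere in searchTerms
        if (searchTerms.indexOf(word) == -1) {
            fullMatch = false; // result is not a full match
            break;
        }
    }
    if (fullMatch) {
        // Every word was found, so result is a full match.
    }
}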
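One thing to keep in mind with the indexOf approach: String.indexOf looks for substrings, not whole words, so a word can "match" inside a longer word. The sketch below contrasts the two behaviours; the class and method names are hypothetical, and the result "He Man" is an invented input chosen only to show the difference.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; names are illustrative, not from the original posts.
class IndexOfMatchDemo {

    // Substring check: each word of result must occur somewhere inside searchTerms.
    static boolean isFullMatchBySubstring(String searchTerms, String result) {
        for (String word : result.trim().split("\\s+")) {
            if (searchTerms.indexOf(word) == -1) {
                return false;
            }
        }
        return true;
    }

    // Whole-word check: each word of result must equal some word of searchTerms.
    static boolean isFullMatchByWord(String searchTerms, String result) {
        List<String> searchWords = Arrays.asList(searchTerms.trim().split("\\s+"));
        for (String word : result.trim().split("\\s+")) {
            if (!searchWords.contains(word)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String searchTerms = "Ernest Hemingway - The Old Man and the Sea";
        // "He" occurs inside "Hemingway", so the substring check accepts it.
        System.out.println(isFullMatchBySubstring(searchTerms, "He Man")); // true
        System.out.println(isFullMatchByWord(searchTerms, "He Man"));      // false
    }
}
```

Whether the substring behaviour matters depends on the data; for titles like those in the question, both checks agree.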
Upvotes: 1