Reputation: 504
I'm trying be able to compare two Strings and identify duplicate words. For example;
String1 = "Hello, my name is John."
String2 = "Can you tell me your name please?"
Comparing String1 and String2 would return the word; "name".
I know it is possible to split these two strings into an array of words, and then iterate over each word of each String in a 2-D array. However this is computationally expensive at O(n^2) and I was wondering if there is a faster way of doing this?
Thanks.
EDIT: Changed the example for clarity.
Upvotes: 7
Views: 4089
Reputation: 8756
After getting the strings to word arrays:
You can add all the elements in the first array to a hashmap and then scan the second array to see if each of the elements exists in the hashmap. Since access time to a hashmap is O(1), this will be O(n+m) time complexity.
If you do not want to use extra space, you can sort both of the arrays in O(nlogn) and then compare the items in O(n+m) which would give you O(nlogn) in total.
Upvotes: 13
Reputation: 25613
One simple solution is to use the Sets.intersection
method of Guava's Sets
. It is pretty easy:
String s1 = "Hello, my name is John.";
String s2 = "Can you tell me your name?";
Splitter splitter = Splitter.onPattern("\\W").trimResults().omitEmptyStrings();
Set<String> intersection = Sets.intersection(//
Sets.newHashSet(splitter.split(s1)), //
Sets.newHashSet(splitter.split(s2)));
System.out.println(intersection);
Output:
[name]
You can also find more information on algorithms to detect Set intersection on this thread.
Upvotes: 6