Dan_Dan_Man
Dan_Dan_Man

Reputation: 504

Comparing two Strings in java and identifying duplicate words

I'm trying be able to compare two Strings and identify duplicate words. For example;

String1 = "Hello, my name is John."
String2 = "Can you tell me your name please?"

Comparing String1 and String2 would return the word; "name".

I know it is possible to split these two strings into an array of words, and then iterate over each word of each String in a 2-D array. However this is computationally expensive at O(n^2) and I was wondering if there is a faster way of doing this?

Thanks.

EDIT: Changed the example for clarity.

Upvotes: 7

Views: 4089

Answers (2)

Faruk Sahin
Faruk Sahin

Reputation: 8756

After getting the strings to word arrays:

You can add all the elements in the first array to a hashmap and then scan the second array to see if each of the elements exists in the hashmap. Since access time to a hashmap is O(1), this will be O(n+m) time complexity.

If you do not want to use extra space, you can sort both of the arrays in O(nlogn) and then compare the items in O(n+m) which would give you O(nlogn) in total.

Upvotes: 13

Alex
Alex

Reputation: 25613

One simple solution is to use the Sets.intersection method of Guava's Sets. It is pretty easy:

String s1 = "Hello, my name is John.";
String s2 = "Can you tell me your name?";
Splitter splitter = Splitter.onPattern("\\W").trimResults().omitEmptyStrings();
Set<String> intersection = Sets.intersection(//
        Sets.newHashSet(splitter.split(s1)), //
        Sets.newHashSet(splitter.split(s2)));
System.out.println(intersection);

Output:

[name]

You can also find more information on algorithms to detect Set intersection on this thread.

Upvotes: 6

Related Questions