Reputation: 469
The code below tries to check whether all the words in searchWords appear in newsPaperWords. Both lists can contain duplicates. If a word appears n times in searchWords, it has to appear at least n times in newsPaperWords for the method to return true. I thought the time complexity was 2*O(n) + O(m), but the interviewer told me it is 2*O(n log n) + O(m log m).
/**
 * @param searchWords The words we're looking for. Can contain duplicates.
 * @param newsPaperWords The list to look into.
 */
public boolean wordMatch(List<String> searchWords, List<String> newsPaperWords) {
    Map<String, Integer> searchWordCount = getWordCountMap(searchWords);
    Map<String, Integer> newspaperWordCount = getWordCountMap(newsPaperWords);
    // Every search word must occur in the newspaper at least as often as it is searched for.
    for (Map.Entry<String, Integer> searchEntry : searchWordCount.entrySet()) {
        Integer occurrencesInNewspaper = newspaperWordCount.get(searchEntry.getKey());
        if (occurrencesInNewspaper == null || occurrencesInNewspaper < searchEntry.getValue()) {
            return false;
        }
    }
    return true;
}

/** Builds a word -> number-of-occurrences map for the given list. */
private Map<String, Integer> getWordCountMap(List<String> words) {
    Map<String, Integer> result = new HashMap<>();
    for (String word : words) {
        Integer occurrencesThisWord = result.get(word);
        if (occurrencesThisWord == null) {
            result.put(word, 1);
        } else {
            result.put(word, occurrencesThisWord + 1);
        }
    }
    return result;
}
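For example, with the method above (illustrative values):

// Illustrative calls (assuming java.util.Arrays is imported):
List<String> search = Arrays.asList("dog", "dog", "cat");
wordMatch(search, Arrays.asList("dog", "cat", "dog", "fish")); // true: "dog" appears twice
wordMatch(search, Arrays.asList("dog", "cat", "fish"));        // false: "dog" appears only once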
As I see it, the time complexity of the method is 2*O(n) + O(m) (where n is the number of elements in searchWords and m the number of elements in newsPaperWords):

- getWordCountMap() has a complexity of O(n), where n is the number of elements in the given list: the method loops over the list once, assuming that the calls to result.get(word) and result.put() are O(1).
- The final loop over searchWordCount.entrySet() is, worst case, O(n), assuming again that calls to HashMap.get() are O(1).

So, simply adding up: O(n) + O(m) to build the two maps, plus O(n) for the final loop, which simplifies to O(n + m).
After reading this answer, taking O(n) as the worst-case complexity for HashMap.get(), I could understand that the complexity of getWordCountMap() goes up to O(n*2n) and the final loop to O(n*n), which would give a total complexity of O(n*2n) + O(m*2m) + O(n*n).
But how is it 2*O(n log n) + O(m log m)?
Upvotes: 2
Views: 141
Reputation: 14389
The complexity you have deduced is correct, assuming the HashMap uses a proper hash function. This algorithm looks like O(m + n) to me.
I guess that your interviewer described the complexity of another approach to solving this problem, one that is more time-consuming but ends up taking less space; perhaps sorting both lists and comparing them, as sketched below.
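A minimal sketch of that idea, assuming (this is my guess, not something the interviewer confirmed) a sort-based solution: sort both lists, then walk them in lockstep. The two sorts cost O(n log n) and O(m log m); the comparison pass is only O(n + m), and no auxiliary maps are needed.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SortBasedWordMatch {

    /**
     * Sort-based variant: O(n log n) + O(m log m) for the two sorts, then a
     * single O(n + m) two-pointer pass. The only extra space is the two
     * sorted copies (or O(1) if the caller's lists may be sorted in place).
     */
    public static boolean wordMatch(List<String> searchWords, List<String> newsPaperWords) {
        List<String> search = new ArrayList<>(searchWords);
        List<String> paper = new ArrayList<>(newsPaperWords);
        Collections.sort(search);
        Collections.sort(paper);

        int p = 0;
        for (String wanted : search) {
            // Skip newspaper words that sort before the word we need.
            while (p < paper.size() && paper.get(p).compareTo(wanted) < 0) {
                p++;
            }
            // Ran out of newspaper words, or skipped past 'wanted' entirely:
            // the newspaper has too few occurrences of this word.
            if (p == paper.size() || !paper.get(p).equals(wanted)) {
                return false;
            }
            p++; // Consume this occurrence so duplicates are counted correctly.
        }
        return true;
    }
}

Since both lists are sorted, duplicate search words are handled naturally: each matching newspaper word is consumed exactly once, so a word searched for n times must be found n times.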
Upvotes: 1
Reputation: 44962
Due to JEP 180: Handle Frequent HashMap Collisions with Balanced Trees, the worst case for the HashMap.get() operation will be O(log n). To quote JEP 180:
The principal idea is that once the number of items in a hash bucket grows beyond a certain threshold, that bucket will switch from using a linked list of entries to a balanced tree. In the case of high hash collisions, this will improve worst-case performance from O(n) to O(log n).
This would make the getWordCountMap() method O(n log n) in the worst case.
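To see this treeification in action, here is a small, hypothetical demonstration (BadKey is an illustrative class, not part of the question's code). Every key hashes to the same bucket, yet lookups stay fast, because BadKey is Comparable and HashMap turns the overcrowded bucket into a balanced tree:

import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {

    // A deliberately bad key: every instance has the same hash code, so all
    // entries land in one bucket. Because it implements Comparable, HashMap
    // can treeify that bucket (JEP 180), keeping get() at O(log n), not O(n).
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; } // every key collides
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
        @Override public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 100_000; i++) {
            map.put(new BadKey(i), i);
        }
        // All 100,000 entries share a single bucket; this lookup still
        // completes quickly because the bucket is a red-black tree.
        System.out.println(map.get(new BadKey(99_999))); // prints 99999
    }
}

With a pre-JEP-180 HashMap (a linked list per bucket) the same program would degrade to O(n) per lookup and O(n^2) to build the map.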
Upvotes: 3