antonro
antonro

Reputation: 469

Why is this method's time complexity 2*O(n log n) + O(m log m)?

The code below tries to check if all the words in searchWords appear in newsPaperWords. Both lists can contain duplicates. If a word appears n times in searchWords, it'll have to appear at least n times in newsPaperWords for the method to return true. I thought that the time complexity was 2*O(n) + O(m) but the interviewer told me that it is 2*O(n log n) + O(m log m).

/**
 * @param searchWords The words we're looking for. Can contain duplicates
 * @param newsPaperWords  The list to look into
 */
public boolean wordMatch(List<String> searchWords, List<String> newsPaperWords) {
    Map<String, Integer> searchWordCount = getWordCountMap(searchWords);
    Map<String, Integer> newspaperWordCount = getWordCountMap(newsPaperWords);
    for (Map.Entry<String, Integer> searchEntry : searchWordCount.entrySet()) {
        Integer occurrencesInNewspaper = newspaperWordCount.get(searchEntry.getKey());
        if (occurrencesInNewspaper == null || occurrencesInNewspaper < searchEntry.getValue()) {
            return false;
        }
    }
    return true;
}

private Map<String, Integer> getWordCountMap(List<String> words) {
    Map<String, Integer> result = new HashMap<>();
    for (String word : words) {
        Integer occurrencesThisWord = result.get(word);
        if (occurrencesThisWord == null) {
            result.put(word, 1);
        } else {
            result.put(word, occurrencesThisWord + 1);
        }
    }
    return result;
}

As I see it, the time complexity of the method is 2*O(n) + O(m) (being n the number of elements in searchWords and m the number of elements in newsPaperWords):

So, simply adding, O(n) + O(m) to build the two maps plus O(n) for the last look.

After reading this answer, taking O(n) as worst-case complexity for HashMap.get(), I could understand that the complexity of getWordCountMap() goes up to O(n*2n) and the final loop to O(n*n), which would give a total complexity of O(n*2n) + O(m*2m) + O(n*n).

But how is it 2*O(n log n) + O(m log m)?

Upvotes: 2

Views: 141

Answers (2)

displayName
displayName

Reputation: 14389

The complexity you have deduced is correct assuming the hashmap is using a proper hash function. This algorithm looks like O(m + n) to me.


I guess that your interviewer described the complexity of another approach to solving this problem which is more time-consuming but ends up taking less space.

Upvotes: 1

Karol Dowbecki
Karol Dowbecki

Reputation: 44962

Due to JEP 180: Handle Frequent HashMap Collisions with Balanced Trees the worst case for HashMap.get() operation will be O(log n). To quote the JEP 180:

The principal idea is that once the number of items in a hash bucket grows beyond a certain threshold, that bucket will switch from using a linked list of entries to a balanced tree. In the case of high hash collisions, this will improve worst-case performance from O(n) to O(log n).

This would make getWordCountMap() method O(n log n).

Upvotes: 3

Related Questions