Reputation: 59
Given two lists A and B, where B is an anagram of A (B is made by randomizing the order of the elements in A), we want to find an index mapping P from A to B. A mapping P[i] = j means the ith element in A appears in B at index j. The lists A and B may contain duplicates.
For example, given
A = [12, 28, 46, 32, 50]
B = [50, 12, 32, 46, 28]
we should return [1, 4, 3, 2, 0].
My solution, which is O(n^2), is:
public int[] anagramMappings(int[] A, int[] B) {
    int[] result = new int[A.length];   // one mapping entry per element of A
    int count = 0;
    for (int i = 0; i < A.length; i++) {
        // linear scan of B for the first index holding the same value
        for (int j = 0; j < B.length; j++) {
            if (A[i] == B[j]) {
                result[i] = j;
                count++;
                break;
            }
        }
    }
    // copy the filled portion into an array of exactly the right size
    int[] tempArray = new int[count];
    for (int i = 0; i < count; i++) {
        tempArray[i] = result[i];
    }
    return tempArray;
}
Here's another solution, which I think should be more efficient than the one above. However, when I tested both snippets on different inputs, the 1st snippet almost always executed faster.
Please clarify why the 1st snippet is faster than the 2nd. I believe the 2nd one is more efficient, with O(n) complexity:
public int[] anagramMappingsx(int[] A, int[] B) {
    int[] res = new int[A.length];
    int index = 0;
    // map each value in B to the first index where it occurs
    // (requires java.util.Map and java.util.HashMap)
    Map<Integer, Integer> map = new HashMap<>();
    for (int i = 0; i < B.length; i++) {
        if (!map.containsKey(B[i])) {
            map.put(B[i], i);
        }
    }
    // look up each element of A in constant expected time
    for (Integer i : A) {
        if (map.containsKey(i)) {
            res[index++] = map.get(i);
        }
    }
    return res;
}
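For reference, a minimal driver along these lines can be used to check both versions against the example above (the AnagramMappings wrapper class is just an assumed name for a class holding the two methods):

import java.util.Arrays;

public class AnagramMappings {

    // ... the two methods above go here ...

    public static void main(String[] args) {
        int[] A = {12, 28, 46, 32, 50};
        int[] B = {50, 12, 32, 46, 28};

        AnagramMappings solver = new AnagramMappings();
        // both calls should print [1, 4, 3, 2, 0] for this input
        System.out.println(Arrays.toString(solver.anagramMappings(A, B)));
        System.out.println(Arrays.toString(solver.anagramMappingsx(A, B)));
    }
}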
Upvotes: 2
Views: 139
Reputation: 299345
Big-O analysis assumes that N is very large. It is about what happens to complexity as N goes to infinity. So, for instance, O(n + 100) is the same as O(n). But clearly for small N and large constants, that's not true.
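To put rough (made-up) numbers on it: suppose the hash-based version costs about 100·n elementary operations once you count hashing, boxing, and bucket lookups, while the nested loops cost about n². Then n² < 100·n for every n below 100, so the "worse" algorithm wins comfortably at n = 5 even though it loses badly as n grows.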
In your case your input is tiny and your O(n) algorithm uses a very complex data structure that requires hashing and table lookups (plus dealing with bucket misses and all the rest). Your O(n^2) algorithm does none of that. Maps can make up this cost in the long run, but certainly not in the short run.
As a rule, for small data sets in most languages, you should expect arrays to be the fastest data structure available, even if they force you to use an O(n^2) algorithm. It generally takes more than a few elements to recoup the cost of more complex data structures.
Arrays also tend to be faster than other data structures due to memory locality and compiler optimizations (though this depends on your language). Memory locality, reducing allocations/deallocations, and eliminating dynamic dispatch all often have as much or more impact on real-world performance than big-O complexity analysis.
It's a shame that CS programs and whiteboard interviews focus so much on big-O analysis as though it were the start and end of performance. There is much more to performance than algorithmic complexity.
If you want to see your O(n) algorithm blow your O(n^2) algorithm away, try running them with 10k or 10M elements, not 5. At those scales, big-O is much more likely to dominate the situation.
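If you want to try that yourself, a rough timing sketch along these lines should show the crossover (this is not a rigorous benchmark: there is no JIT warm-up, the AnagramMappings class is just an assumed wrapper holding your two methods, and a serious measurement would use something like JMH):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class MappingBenchmark {

    public static void main(String[] args) {
        // assumed wrapper class containing anagramMappings and anagramMappingsx from the question
        AnagramMappings solver = new AnagramMappings();

        for (int n : new int[] {5, 1_000, 10_000}) {
            // build A as 0..n-1 and B as a shuffled copy of A
            int[] A = new int[n];
            for (int i = 0; i < n; i++) {
                A[i] = i;
            }
            List<Integer> shuffled = new ArrayList<>();
            for (int v : A) {
                shuffled.add(v);
            }
            Collections.shuffle(shuffled, new Random(42));
            int[] B = new int[n];
            for (int i = 0; i < n; i++) {
                B[i] = shuffled.get(i);
            }

            long t0 = System.nanoTime();
            solver.anagramMappings(A, B);   // nested loops, O(n^2)
            long t1 = System.nanoTime();
            solver.anagramMappingsx(A, B);  // HashMap, O(n)
            long t2 = System.nanoTime();

            System.out.printf("n=%,d  nested loops: %.3f ms  hashmap: %.3f ms%n",
                    n, (t1 - t0) / 1e6, (t2 - t1) / 1e6);
        }
    }
}

At n = 5 the nested loops usually win; by n = 10,000 the HashMap version should be far ahead.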
Upvotes: 7