Reputation: 2495
Hi Which of this time complexity is better.
O(n log m)
vs O(n + m)
Analyzing these 2 algorithms, which one is better to use practicall on a large scale?
e.g) if n and m are exponentially large it becomes
O(2e)
vs O(e*(something linear to e))
Sample Problem: Common elements in 2 arrays:
1) 2 pointer approach
2) Using Binary search
Upvotes: 12
Views: 15361
Reputation: 4062
Neither is unambiguously better, because it depends on the relative values of n
and m
.
If you assume they're equal, you have O(n log n)
vs O(n)
, so the second one (O(n + m)
) is faster. If, on the other hand, n
is effectively constant while m
grows quickly, then you're looking at O(log m)
vs O(m)
, so the first one is better.
Basically, the first is better for large m
, and the second is better for cases where they're both equally large. (If n
dominates the equation then they're both just linear.)
Upvotes: 14
Reputation: 11
In summary, it comes out to be the values of m+n versus (one of mlogn or nlogm) depending upon the lengths of the two arrays you are choosing. Though Big-O complexity analysis does not account for the "base" of logarithm, for problems like these where the values are determining better complexity, a point to note is the base of logarithm is "2" but not "10" where specific examples are taken to calculate the number of elements looked at or the number of comparisons done.
Upvotes: 1
Reputation: 134065
I think what you're asking is how best to find the common elements in two sorted arrays. You have the choice of two algorithms:
If the arrays are close to the same size, then the first algorithm, which is strictly linear, is preferable.
If one of the arrays is much larger than the other, then you might want to consider the binary search approach. You'll want to search the larger array for items that are in the smaller array. For example, if you have the arrays M and N, where M has 1,000,000 items and N has 100 items, you have a choice:
If you look for M in N, then the complexity is O(m log n). Here, m=1000000
, and log(n)=7
(approximately)
If you look for N in M, then the complexity is O(n log m). n=100
and log(m)=20
(approximately).
It's pretty clear in that case which you want to do.
In practice, the cutoff between whether you should use the O(m+n) or O(n log m) (where n is smaller than m) algorithm can only be determined empirically. It's not just a matter of figuring whether (m + n) < (n log m)
, because binary search involves a bit of overhead that the two-pointer method doesn't. It's quite likely that the two pointer method would be faster even if (m + n)
was double or triple (n log m)
.
Upvotes: 24