Reputation: 374
A question that has me speculating is the following:
Let's say we have a sorted array with the numbers {1,1,1,1,2,2,4,4,4}.
Now, given that we can clearly see that we have six pairs on 1's, one pair of 2's and three pairs of 4's (10 pairs). How would you construct an algorithm that finds these pairs in O(n)?
I have an algorithm that counts the pairs in an array and does so like this:
Arrays.sort(array);
int counter = 0;
for(int i = 0; i<array.length; i++) {
for(int j = i+1; j<array.length; j++) {
if(j!=i && array[j] == array[i]) {
counter++;
}
}
}
return counter;
But this runs in O(N^2) and I guess (being the novice that I am with algorithms) that there's a better solution to this to get the same results using only one for-loop or multiple while-loops?
I'd like to hear your thoughts!
Upvotes: 6
Views: 2373
Reputation: 31
Another way of doing it in O(n)
. You can count all the pairs as you scan along the array.
int countPairs(vector<int> arr) {
int count = 0;
int i = 0;
while (i < arr.size()) {
int j = i + 1;
while (j < arr.size() && arr[j] == arr[i]) {
count += j - i;
j++;
}
i = j;
}
return count;
}
Runnable at - https://ideone.com/TFoZdv
Upvotes: 0
Reputation: 1004
This is my way of doing this. Hope it will help somebody :)
static int foo(int[] ar) {
int count = 0;
Arrays.sort(ar);
for(int i = 0; i<ar.length-1;i++)
{
if(ar[i] == ar[i+1])
{
count ++;
i++;
}
}
return count;
}
Upvotes: 1
Reputation: 9117
You can do it in O(N)
:
int pairs = 0;
int times = 1;
for (int i = 1; i < array.length; ++i) {
if (array[i] == array[i-1]) {
++times;
} else {
pairs += times*(times-1) / 2;
times = 1;
}
}
pairs += times*(times-1) / 2;
return pairs;
Runnable: https://ideone.com/mu65ie
For every distinct number, count the number of its occurrences times
. The number of different pairs is equal to number of choices C(times, 2) = times*(times-1) / 2
.
Upvotes: 4
Reputation: 1794
The secret is to stop reiterating. Cache data as it appears. You can use caching to reduce this into an O(nlogn) problem.
Pairs is very vague wording, so in the future a few more examples will clarify things you don't know a name to find answers for. You can use mathematics of Combinations to reduce the problem into O(n).
The wikipedia article is a bit obtuse to read, but the equation you're looking for is near the top:
n! / (k! * (n - k)!)
where !
indicates a factorial number, n
indicates the amount of items to be combined (4 1s), and k
indicates the amount of items per combination (2 for a pair). So substituting these values:
4! = 24
2! = 2
(4-2)! = 2
4!/(2!2!) = 24/4 = 6
Using this equation it can be reduced to O(n). Since factorials are used and the dataset is sorted, you can further improve performance by caching the result of the factorial call for future calls. Sorted input for a factorial function will have cache hits for nearly every lookup!
Caching may not be necessary if you are using python 3 as it uses a much more efficient algorithm to calculate factorials than python 2. Caching will reduce redundancy, however which may yield a good result on very large values.
An example of caching (memoization):
import math
class Cache(object):
memos = dict()
def factorial(self, n):
if not n in self.memos:
self.memos[n] = math.factorial(n)
return self.memos[n]
Upvotes: 2
Reputation: 9650
How about:
Arrays.sort(array);
int counter = 0;
for(int i = 1; i<array.length; i++) {
if(array[i] == array[i-1]) {
counter++;
++i;
}
}
return counter;
Upvotes: 1
Reputation: 3947
Okay so here's also my solution:
int i = 0;
int pairs = 0;
while (i < array.length - 1) {
if(array[i] == array[i + 1]) {
pairs += 1;
i += 2;
} else {
i++;
}
}
When a pair is found the index is incremented by two, this makes traversal a little bit faster. But the complexity is O(n)
anyway.
Of course you run this after the array is sorted
.
Upvotes: 2