Reputation: 3260
An identical pair in array are 2 indices p,q
such that
0<=p<q<N
and array[p]=array[q]
where N
is the length of the array.
Given an unsorted array, find the number identical pairs in the array.
My solution was to sort the array by values, keeping track of indices.
Then for every index p
in sorted array, count all q<N
such that and
sortedarray[p].index < sortedarray[q].index and
sortedarray[p] = sortedarray[q]
Is this the correct approach. I think the complexity would be
O(N log N) for sorting based on value +
O(N^2) for counting the newsorted array that satisfies the condition.
This means I am still looking at O(N^2)
. Is there a better way ?
Another thought that came was for every P binary search the sorted array for all Q that satisfies the condition. Would that not reduce the complexity of the second part to O(Nlog(N))
Here is my code for second part
for(int i=0;i<N;i++){
int j=i+1;
while( j<N && sortedArray[j].index > sortedArray[i].index &&
sortedArray[j].item == sortedArray[i].item){
inversion++;
j++;
}
}
return inversion;
@Edit: I think, I mistook the complexity of second part to be O(N^2)
.
As in every iteration in while loop, no rescan of elements from indices 0-i occurs, linear time is required for scanning the sorted array to count the inversions. The total complexity is therefore
O(NlogN)
for sorting and O(N)
for linear scan count in sorted array.
Upvotes: 1
Views: 3636
Reputation: 199
I've been thinking about this.... I think that if you "embed" the ==
condition into your sorting algorithm, then, the complexity is still O(n lg n).
Upvotes: 0
Reputation: 7996
If you can allocate more memory you can get some gains.
You can reach O(n)
by using a hash table which maps any values in the array to a counter indicating how often you already saw this value.
If the number of allowed values is integral and in a limited range you can directly use an array instead of a hash table. The index of value i
being i
itself. In that case the complexity would be O(n+m)
where m
is the number of allowed values (because you must first set to 0
all entries in the array and then look through all the array entries to count pairs).
Both methods gives you the number of identical values for each values in your array. Let's call this number nv_i
the number of appearance of the value i
in the array. Then the number of pairs of value i
is: (nv_i)*(nv_i-1)/2
.
You can pair:
1st i with nv_i-1 others
2nd i with nv_i-2 others
...
last i with 0
And (nv_i-1)+(nv_i-2)+...+0 = (nv_i)*(nv_i-1)/2
Upvotes: 0
Reputation: 3988
As Tim points out in his response, the complexity of finding the pairs within a sorted array is O(n)
and not O(n^2)
.
To convince yourself of this, think about a typical O(n^2)
algorithm: Insertion Sort.
An animated example can be found here.
As you can see in the gif, the reason why this algorithm is quadratic, is because, for each element, it has to check the whole array to ensure where such element will have to go (this includes previous elements in the array!).
On the hand, in you case, you have an ordered array: e.g. [0,1,3,3,6,7,7,9,10,10]
In this situation, you will start scanning (pairwise) from the beginning, and (because of the fact that the array is ordered) you know that once an element is scanned and you pointers proceed, there cannot be any reason to rescan previous elements in the future, because otherwise you would have not proceeded in the first place.
Hence, you scan the whole array only once: O(n)
Upvotes: 1
Reputation: 521194
You are partially correct. Sorting the array via Merge Sort or Heapsort will take O(n lg n)
. But once the array is sorted, you can make a single pass through to find all identical pairs. This single pass is an O(n)
operation. So the total complexity is:
O(n lg n + n) = O(n lg n)
Upvotes: 3