Reputation: 8477
I have a rather large int[] which is sorted using Arrays.sort()
.. I need to remove the duplicate elements from the array.
This question originates from sedgewick's Algorithms book 1.1.28
1.1.28 Remove duplicates. Modify the test client in BinarySearch to remove any du- plicate keys in the whitelist after the sort.
I tried to create a noDupes() method which takes in an int[] and returns an int[] with duplicates removed
The rank() methods are from sedgewick's code.which does the binary search
public static int[] noDupes(int[] a){
Arrays.sort(a);
int maxval= a[a.length-1];
int[] nodupes = new int[maxval];
int i=0;
for(int j=0;j<a.length;j++){
int rnk = rank(a[j],nodupes);
System.out.println(a[j]+" rank="+rnk);
if (rnk < 0){
System.out.println(a[j]+" is not dupe");
nodupes[i] = a[j];
i++;
}
}
return nodupes;
}
public static int rank(int key,int[] a){
return rank(key,a,0,a.length-1);
}
public static int rank(int key,int[] a,int lo,int hi){
if(lo > hi) return -1;
int mid = lo+(hi-lo)/2;
if(key < a[mid])return rank(key,a,0,mid-1);
else if(key > a[mid])return rank(key,a,mid+1,hi);
else return mid;
}
When I ran this with a sample array
int[] a =new int[]{2,2,2,3,4,4,5,6};
int[] ret = noDupes(a);
I am getting some unexpected output..even after 2 is added to the nodupes array, the rank for an existing element is -1..
2 rank=-1
2 is not dupe
2 rank=-1
2 is not dupe
2 rank=-1
2 is not dupe
3 rank=-1
3 is not dupe
4 rank=-1
4 is not dupe
4 rank=4
5 rank=-1
5 is not dupe
6 rank=-1
6 is not dupe
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 6
at ...noDupes(BinSearch.java:85)
at ...main(BinSearch.java:96)
I couldn't figure out what I am doing wrong..can someone help?
Upvotes: 0
Views: 2881
Reputation: 30819
This code will help you.
public Integer[] removeDuplicates(Integer[] input){
Integer[] arrayWithoutDuplicates = null;
Set<Integer> set = new LinkedHashSet<Integer>();
for(int i : input){
set.add(i);
}
arrayWithoutDuplicates = (Integer[]) set.toArray();
return arrayWithoutDuplicates;
}
Upvotes: 0
Reputation: 485
If you have your array sorted and if you want to remove duplicates I think you don't need to use binary search for that.
when you sort an array, duplicate elements will be adjacent to each other.
E.g. Array = {9,8,9,1,2,5,2,5,1} After sorting Array = {1,1,2,2,5,5,8,9,9}
You can use following way to remove duplicates (inplace)
int a[] = {sorted array}
for(int i=0,target=0;i<a.length-1;i++) {
if(a[i]!=a[i+1]) {
a[target++] = a[i];
}
}
a[target++] = a[a.length-1];
for(int i=target;i<a.length;i++) {
a[i] = 0; // fill in the values which you don't want.
}
will remove duplicates in one pass only
Upvotes: 2
Reputation: 1967
This should help:
int[] nodupes = new int[a.length];
nodupes array is going out of bound.
Note: I am not sure if logic that you are using is the best for the problem. But this should solve your exception.
Upvotes: 0
Reputation: 136042
I would do it this way
public static int[] noDupes(int[] a) {
Arrays.sort(a);
int noDupCount = 0;
for (int i = 0; i < a.length; i++) {
if (i == 0 || a[i] != a[i - 1]) {
noDupCount++;
}
}
int[] a2 = new int[noDupCount];
for (int i = 0, j = 0; i < a.length; i++) {
if (i == 0 || a[i] != a[i - 1]) {
a2[j++] = a[i];
}
}
return a2;
}
Upvotes: 3
Reputation: 2102
just add all the array values to the HashSet it will automatically remove duplicates and gives you unique values and then again convert it to array that you required
Upvotes: 3