Reputation: 25
I'm taking an introductory course to Java and one of my latest projects involve making sure an array doesn't contain any duplicate elements (has distinct elements). I used a for loop with an inner for loop, and it works, but I've heard that you should try to avoid using many iterations in a program (and other methods in my classes have a fair number of iterations as well). Is there any efficient alternative to this code? I'm not asking for code of course, just "concepts." Would there potentially be a recursive way to do this? Thanks!
The array sizes are generally <= 10.
/** Iterates through a String array ARRAY to see if each element in ARRAY is
* distinct. Returns false if ARRAY contains duplicates. */
boolean distinctElements(String[] array) { //Efficient?
for (int i = 0; i < array.length; i += 1) {
for (int j = i + 1; j < array.length; j += 1) {
if (array[i] == array[j]) {
return false;
}
}
} return true;
}
Upvotes: 0
Views: 6787
Reputation: 140299
"Efficiency" is almost always a trade-off. Occasionally, there are algorithms that are simply better than others, but often they are only better in certain circumstances.
For example, this code above: it's got time complexity O(n^2)
.
One improvement might be to sort the strings: you can then compare the strings by comparing if an element is equal to its neighbours. The time complexity here is reduced to O(n log n)
, because of the sorting, which dominates the linear comparison of elements.
However - what if you don't want to change the elements of the array - for instance, some other bit of your code relies on them being in their original order - now you also have to copy the array and then sort it, and then look for duplicates. This doesn't increase the overall time or storage complexity, but it does increase the overall time and storage, since more work is being done and more memory is required.
Big-oh notation only gives you a bound on the time ignoring multiplicative factors. Maybe you only have access to a really slow sorting algorithm: actually, it turns out to be faster just to use your O(n^2)
loops, because then you don't have to invoke the very slow sort.
This could be the case when you have very small inputs. An oft-cited example of an algorithm that has poor time complexity but actually is useful in practice is Bubble Sort: it's O(n^2)
in the worst case, but if you have a small and/or nearly-sorted array, it can actually be pretty darn fast, and pretty darn simple to implement - never forget the inefficiency of you having to write and debug the code, and to have to ask questions on SO when it doesn't work as you expect.
What if you know that the elements are already sorted, because you know something about their source. Now you can simply iterate through the array, comparing neighbours, and the time complexity is now O(n)
. I can't remember where I read it, but I once saw a blog post saying (I paraphrase):
A given computer can never be made to go quicker; it can only ever do less work.
If you can exploit some property to do less work, that improves your efficiency.
So, efficiency is a subjective criterion:
Upvotes: 2
Reputation: 201399
First, array[i] == array[j]
tests reference equality. That's not how you test String
(s) for value equality.
I would add each element to a Set
. If any element isn't successfully added (because it's a duplicate), Set.add(E)
returns false
. Something like,
static boolean distinctElements(String[] array) {
Set<String> set = new HashSet<>();
for (String str : array) {
if (!set.add(str)) {
return false;
}
}
return true;
}
You could render the above without a short-circuit like
static boolean distinctElements(String[] array) {
Set<String> set = new HashSet<>(Arrays.asList(array));
return set.size() == array.length;
}
Upvotes: 1