Reputation: 115
I'd like to start off by saying this is a little more of a general question; not one pertaining to the specific examples that I have given, but simply a conceptual topic.
Example #1: I'm creating a truly random string with UUID.java. Let's say I never want to have the same UUID generated, ever. Here's an idea of the circumstance: (Let's assume that I'm saving/loading the List at the top- that's not the point)
Gist URL (I'm new to StackExchange- sorry!)
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
public class Example {
/**
* A final List<String> of all previous UUIDs generated with
* generateUniqueID(), turned into a string with uuid.toString();
*/
private static final List<String> PREVIOUS = new ArrayList<String>();
/**
* Generates a truly unique UUID.
*
* @param previous
* A List<String> of previous UUIDs, converted into a string with
* uuid.toString();
* @return a UUID generated with UUID.randomUUID(); that is not included in
* the given List<String>.
*/
public static UUID generateUniqueID(List<String> previous) {
UUID u = UUID.randomUUID();
if (previous.contains(u.toString())) {
return generateUniqueID(previous);
}
return u;
}
/**
* Generates a truly unique UUID using the final List<String> PREVIOUS
* variable defined at the top of the class.
*
* @return A truly random UUID created with generateUniqueID(List<String>
* previous);
*/
public static UUID generateUniqueID() {
UUID u = generateUniqueID(PREVIOUS);
PREVIOUS.add(u.toString());
return u;
}
}
Example #2: Okay, maybe UUID was a bad example, so let's use Random and a double. Here's another example:
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
public class Example2 {
/**
* A final List<Double> of all previous double generated with
* generateUniqueDouble(), turned into a string with Double.valueOf(d);
*/
private static final List<Double> PREVIOUS = new ArrayList<Double>();
/**
* The RANDOM variable used in the class.
*/
private static final Random RANDOM = new Random();
/**
* Generates a truly unique double.
*
* @param previous
* A List<Double> of previous doubles, converted into a Double
* with Double.valueOf(d);
* @return a UUID generated with UUID.randomUUID(); that is not included in
* the given List<Double>.
*/
public static double generateUniqueDouble(List<Double> previous) {
double d = RANDOM.nextDouble();
if (previous.contains(Double.valueOf(d))) {
return generateUniqueDouble(previous);
}
return d;
}
/**
* Generates a truly unique double using the final List<Double> PREVIOUS
* variable defined at the top of the class.
*
* @return A truly random double created with generateUnique(List<Double>
* previous);
*/
public static double generateUnique() {
double d = RANDOM.nextDouble();
PREVIOUS.add(Double.valueOf(d));
return d;
}
}
The point: Is this the most efficient method of doing something like this? Keep in mind I gave you examples, so they're pretty vague. Preferrably I wouldn't like to use any libraries for this, but if they really are a substantial difference in efficency please let me know about them.
Please let me know what you think in the responses :)
Upvotes: 1
Views: 492
Reputation: 54709
Some points have already been discussed in the comments. To summarize and elaborate them here:
It is very unlikely that you create the same double
value twice. There are roughly 7*1012 different double
values (assuming that the random number generator can deliver "most" of them). For the UUIDs, the chance of creating the same value twice is even lower, since there are 2122 different UUIDs. If you created enough elements to have a non-negligible chance for a collision, you'd run out of memory anyhow.
So this approach does not make sense in practice.
However, from a purely theoretical point of view:
Using a List
for this operation is not optimal. The "best case" (and by far the most common case) for you is that the new element is not contained in the list. But for the check whether the element is contained, this is the worst case: You'll have to check each and every element of the list, only to detect that the new element was not yet present. This is said to be linear complexity, or for short, O(n). You could use a different data structure where checking whether an element is contained can be done more quickly, namely in O(1). For example, you could replace the line
private static final List<Double> PREVIOUS = new ArrayList<Double>();
with
private static final Set<Double> PREVIOUS = new HashSet<Double>();
(referring to the recursive approach in general here)
Performance
From a performance point of view, you should not use recursion when it can easily be replaced by an iterative solution. In this case, this would be trivial:
public static double generateUniqueDouble(List<Double> previous) {
double d = RANDOM.nextDouble();
while (previous.contains(d)) {
d = RANDOM.nextDouble();
}
PREVIOUS.add(d);
return d;
}
(it could be written a bit more compact, but that does not matter now).
Correctness
This is more subtle: When there are many recursive calls, then you might end up with a StackOverflowError
. So you should never use recursion unless you can prove that the recursion will end (or better: That it will end "after a few steps").
But here's your main problem:
The algorithm is flawed. You cannot prove that it will be able to create a new random number. The chance that even a single new element is already contained in the collection of PREVIOUS
elements is ridiculously low for double
(or UUID
) values. But it is not zero. And there is nothing preventing the random number generator from creating the random number 0.5
indefinitely, trillions of times in a row.
(Again: These are purely theoretical considerations. But not as far away from practice as they might look at the first glance: If you did not create random double
values, but random byte
values, then, after 256 calls, there would be no "new" values to return - and you would actually receive the StackOverflowError
...)
Upvotes: 1
Reputation: 5103
I suggest you make the generated IDs sequential numbers instead of doubles or uuids. If you want them to appear random to end users, display the sha1 of the number in base64.
Upvotes: 6
Reputation: 5949
It would be better to use a hash table than a list. Generate your candidate value, check for a collision in the hash table, and accept it if there is no collision. If you use a list, generating a new value is an O(n)
operation. If you use a hash table, generating a new value is an O(1)
operation .
Upvotes: 0