Guilherme Schults
Guilherme Schults

Reputation: 53

Problem to generate random unique numbers from fixed population length

I have a problem here which is: I need to generate random numbers given a fixed length, and every time I generate those numbers, I need to check if it was already seen or not.

Example: my fixed population size is 2.000.000. So, for example, in the first round of my algorithm, my sample size is 400.000. I need to generate 400.000 over 2.000.000. After generating those random numbers, I save them in a HashSet.

In a second rand of my algorithm, let's say I want to generate 20.000 random numbers, but I need to check with those 20.000 numbers was already seen or not by looking at the HashSet (which contains the 400.000 initial numbers from the 1 round).

This is what I got so far:

1 round: 
         population size: 2.000.000
         sample: 400.000
         List<Integer> result = new ArrayList<Integer>(sample);

         I save the numbers in a variable called perm[],so :
         int perm[] = new int[population]

  public List<Integer> generateRandomNumbers (int population, Set<Integer> setListStringSeen, int sample)
  {
   for (int i = 0; i < sample; i++)  
   {
      // random integer between i and population-i
      k = i + (int) (Math.random() * (population - i));
                    
     if(setListStringSeen.contains(k))
     {    
         // the problem here is: when I check here and if the newly generated number
         // was already see, I need to generate again a new number. But in this case,
         // the next number need to be checked again, because it could be seen too.
         // how can I end up this loop of checking?

        k = i + (int) (Math.random() * (population - i));
        
        if(setListStringSeen.contains(k))
        {
            System.out.println("we've choose this number once before");
        }
        
        setListStringSeen.add(k);           
        
      }
    
      int t = perm[k];
      perm[k] = perm[i];
      perm[i] = t;
    
   }

   for (int i = 0; i < sample; i++)
   {
      result.add(perm[i]);
   }

    at the end of 1 round, I add all the generated numbers in a HashSet:

   setListStringSeen.addAll(result);

   return result;

}

Now let's go to the 2 round: let's say we want to generate 20.000 new numbers: what I want is, check if those numbers the will be generated (in the second round) was already seen before by checking the Hashset variable. Any idea on how to do it?

Upvotes: 0

Views: 109

Answers (3)

Pearl
Pearl

Reputation: 452

You can use:

while (set.add(random.nextInt(2000000)) != true);

to add it to the set and it will add it uniquely

Another option could be to create a total sample set in class scope of 2mil and then shuffle it and just pull from the list so you never get the same number twice:

List<Integer> sample = IntStream.rangeClosed(0, 2000000)
    .boxed().collect(Collectors.toList());
Collections.shuffle(sample)

Upvotes: 1

Eritrean
Eritrean

Reputation: 16498

If you are using Java 8 or higher, you could do something like below:

public static void main(String args[]) {
    Random rand = new Random();

    int populationSize =  20;
    int sampleSizeFirstRound =  10;

    Set<Integer> sample = rand.ints(1,populationSize)
            .distinct()
            .limit(sampleSizeFirstRound)
            .boxed()
            .collect(Collectors.toSet());

    int sampleSizeSecondRound =  6;
    Set<Integer> sampleSecondRound = rand.ints(1,populationSize)
            .distinct()
            .boxed()
            .filter(i -> !sample.contains(i))
            .limit(sampleSizeSecondRound)
            .collect(Collectors.toSet());

    System.out.println(sample);
    System.out.println(sampleSecondRound);
}

To make it more manageable I have kept the sizes of the samples small. Adapt them as needed.

Upvotes: 1

Eddie Lopez
Eddie Lopez

Reputation: 1139

You should generate the random numbers beforehand, so you are certain they are not repeated.

An easy way of doing this is to obtain a list of integers and then shuffle it.

For example:

// Obtain a list of integers from 0 to the size of population - 1
final List<Integer> integers = Stream.iterate(0, n -> n + 1)
    .limit(population)
    .collect(Collectors.toList());
// integers will have have [0, 1, 2, .... n]

// Then shuffle them
Collections.shuffle(integers);
// integers will have have something like [3, 66, 44, .... n] randomly

Check https://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#shuffle-java.util.List-java.util.Random-

Upvotes: 1

Related Questions