Michael
Michael

Reputation: 42050

Find random numbers in a given range with certain possible numbers excluded

Suppose you are given a range and a few numbers in the range (exceptions). Now you need to generate a random number in the range except the given exceptions.

For example, if range = [1..5] and exceptions = {1, 3, 5} you should generate either 2 or 4 with equal probability.

What logic should I use to solve this problem?

Upvotes: 1

Views: 671

Answers (7)

Piyush
Piyush

Reputation: 321

Suppose the initial range is [1,n] and and exclusion set's size is x. First generate a map from [1, n-x] to the numbers [1,n] excluding the numbers in the exclusion set. This mapping with 1-1 since there are equal numbers on both sides. In the example given in the question the mapping with be as follows - {1->2,2->4}.

Another example suppose the list is [1,10] and the exclusion list is [2,5,8,9] then the mapping is {1->1, 2->3, 3->4, 4->6, 5->7, 6->10}. This map can be created in a worst case time complexity of O(nlogn).

Now generate a random number between [1, n-x] and map it to the corresponding number using the mapping. Map looks can be done in O(logn).

Upvotes: 1

Xantix
Xantix

Reputation: 3331

Here's another approach...just keep on generating random numbers until you get one that isn't excluded.

Suppose your desired range was [0,100) excluding 25,50, and 75.

Put the excluded values in a hashtable or bitarray for fast lookup.

int randNum = rand(0,100);

while( excludedValues.contains(randNum) )
{
    randNum = rand(0,100);
}

The complexity analysis is more difficult, since potentially rand(0,100) could return 25, 50, or 75 every time. However that is quite unlikely (assuming a random number generator), even if half of the range is excluded.

In the above case, we re-generate a random value for only 3/100 of the original values.

So 3% of the time you regenerate once. Of those 3%, only 3% will need to be regenerated, etc.

Upvotes: 1

Sklivvz
Sklivvz

Reputation: 31133

You can do it in a versatile way if you have enumerators or set operations. For example using Linq:

void Main()
{
    var exceptions = new[] { 1,3,5 };
    RandomSequence(1,5).Where(n=>!exceptions.Contains(n))
                       .Take(10)
                       .Select(Console.WriteLine);
}

static Random r = new Random();

IEnumerable<int> RandomSequence(int min, int max)
{
    yield return r.Next(min, max+1);
}

I would like to acknowledge some comments that are now deleted:

  • It's possible that this program never ends (only theoretically) because there could be a sequence that never contains valid values. Fair point. I think this is something that could be explained to the interviewer, however I believe my example is good enough for the context.

  • The distribution is fair because each of the elements has the same chance of coming up.

  • The advantage of answering this way is that you show understanding of modern "functional-style" programming, which may be interesting to the interviewer.

  • The other answers are also correct. This is a different take on the problem.

Upvotes: 0

Dhruv Kapoor
Dhruv Kapoor

Reputation: 1081

Let's assume, to keep things simple, that arrays are indexed starting at 1, and your range runs from 1 to k. Of course, you can always shift the result by a constant if this is not the case. We'll call the array of exceptions ex_array, and let's say we have c exceptions. These need to be sorted, which shall turn out to be pretty important in a while.

Now, you only have k-e useful numbers to work with, so it'll be meaningful to find a random number in the range 1 to k-e. Say we end up with the number r. Now, we just need to find the r-th valid number in your array. Simple? Not so much. Remember, you can never simply walk over any of your arrays in a linear fashion, because that can really slow down your implementation when you have a lot of numbers. You have do some sort of binary search, say, to come up with a fast enough algorithm.

So let's try something better. The r-th number would nominally have lied at index r in your original array had you had no exceptions. The number at index r is r, of course, since your range and your array indices start from 1. But, you have a bunch of invalid numbers between 1 and r, and you want to somehow get to the r-th valid number. So, lets do a binary search on the array of exceptions, ex_array, to find how many invalid numbers are equal to or less than r, because we have these many invalid numbers lying between 1 and r. If this number is 0, we're all done, but if it isn't, we have a bit more work to do.

Assume you found there were n invalid numbers between 1 and r after the binary search. Let's advance n indices in your array to the index r+n, and find the number of invalid numbers lying between 1 and r+n, using a binary search to find how many elements in ex_array are less than or equal to r+n. If this number is exactly n, no more invalid numbers were encountered, and you've hit upon your r-th valid number. Otherwise, repeat again, this time for the index r+n', where n' is the number of random numbers that lay between 1 and r+n.

Repeat till you get to a stage where no excess exceptions are found. The important thing here is that you never once have to walk over any of the arrays in a linear fashion. You should optimize the binary searches so they don't always start at index 0. Say if you know there are n random numbers between 1 and r. Instead of starting your next binary search from 1, you could start it from one index after the index corresponding to n in ex_array.

In the worst case, you'll be doing binary searches for each element in ex_array, which means you'll do c binary searches, the first starting from index 1, the next from index 2, and so on, which gives you a time complexity of O(log(n!)). Now, Stirling's approximation tells us that O(ln(x!)) = O(xln(x)), so using the algorithm above only makes sense if c is small enough that O(cln(c)) < O(k), since you can achieve O(k) complexity using the trivial method of extracting valid elements from your array first.

Upvotes: 2

Roman Susi
Roman Susi

Reputation: 4199

In Python the solution is very simple (given your example):

import random 
rng = set(range(1, 6))
ex = {1, 3, 5}
random.choice(list(rng-ex))

To optimize the solution, one needs to know how long is the range and how many exceptions there are. If the number of exceptions is very low, it's possible to generate a number from the range and just check if it's not an exception. If the number of exceptions is dominant, it probably makes sense to gather the remaining numbers into an array and generate random index for fetching non-exception.

In this answer I assume that it is known how to get an integer random number from a range.

Upvotes: 2

Save
Save

Reputation: 11938

If you have no constraints at all, i guess this is the easiest way: create an array containing the valid values, a[0]...a[m] . Return a[rand(0,...,m)].

If you don't want to create an auxiliary array, but you can count the number of exceptions e and of elements n in the original range, you can simply generate a random number r=rand(0 ... n-e), and then find the valid element with a counter that doesn't tick on exceptions, and stops when it's equal to r.

Upvotes: 5

pjs
pjs

Reputation: 19855

Depends on the specifics of the case. For your specific example, I'd return a 2 if a Uniform(0,1) was below 1/2, 4 otherwise. Similarly, if I saw a pattern such as "the exceptions are odd numbers", I'd generate values for half the range and double. In general, though, I'd generate numbers in the range, check if they're in the exception set, and reject and re-try if they were - a technique known as acceptance/rejection for obvious reasons. There are a variety of techniques to make the exception-list check efficient, depending on how big it is and what patterns it may have.

Upvotes: 4

Related Questions