Drawing random numbers with draws in some pre-defined interval, `numpy.random.choice()`

Question

I would like to use numpy.random.choice() but make sure that the draws are spaced by at least a certain "interval":

As a concrete example,

import numpy as np
np.random.seed(123)
interval = 5
foo = np.random.choice(np.arange(1,50), 5)  ## 5 random draws (with replacement) from array([ 1,  2, ..., 49])
print(foo)
## array([46,  3, 29, 35, 39])

I would prefer these be spaced by at least the interval+1, i.e. 5+1=6. In the above example, this condition isn't met: there should be another random draw, as 35 and 39 are separated by 4, which is less than 6.

The array array([46, 3, 29, 15, 39]) would be ok, as all draws are spaced by at least 6.

numpy.random.choice(array, size) draws size elements from array. Is there a function for checking the "spacing" between the elements of a numpy array? I could write the above with an if/while statement, but I'm not sure how to check the spacing most efficiently.

Paul Panzer · Accepted Answer

Here is a solution that inserts the spaces after drawing:

def spaced_choice(low, high, delta, n_samples):
    draw = np.random.choice(high-low-(n_samples-1)*delta, n_samples, replace=False)
    idx = np.argsort(draw)
    draw[idx] += np.arange(low, low + delta*n_samples, delta)
    return draw
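
To make the trick easier to follow, here is a small deterministic trace of the same three steps. The values in `draw` are hand-picked stand-ins for the random draw (an assumption made purely for reproducibility), and the names `offsets` and `result` are introduced only for this illustration:

```python
import numpy as np

low, high, delta, n_samples = 4, 20, 3, 4

# Stand-in for np.random.choice(7, 4, replace=False); the pool has
# high - low - (n_samples - 1) * delta = 7 values, i.e. 0..6.
draw = np.array([5, 1, 6, 2])

idx = np.argsort(draw)                                     # [1, 3, 0, 2]
offsets = np.arange(low, low + delta * n_samples, delta)   # [4, 7, 10, 13]

result = draw.copy()
result[idx] += offsets   # the k-th smallest draw is shifted by low + k*delta

print(result)            # [15  5 19  9]
print(np.sort(result))   # [ 5  9 15 19]
```

Because the drawn values are distinct, sorted neighbours differ by at least 1 before the shift, and each successive shift adds another delta, so neighbours end up at least delta+1 apart while the original (random) order of the draw is preserved.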

Sample run:

spaced_choice(4, 20, 3, 4)
# array([ 5,  9, 19, 13])
spaced_choice(1, 50, 5, 5)
# array([30,  8,  1, 15, 43])
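
As for checking the spacing itself, one possible approach is to sort the array and look at the smallest difference between neighbours with `np.diff`. The helper name `min_spacing` below is hypothetical, not a numpy function; the sketch also uses it to sanity-check `spaced_choice` over repeated runs:

```python
import numpy as np

def spaced_choice(low, high, delta, n_samples):
    draw = np.random.choice(high-low-(n_samples-1)*delta, n_samples, replace=False)
    idx = np.argsort(draw)
    draw[idx] += np.arange(low, low + delta*n_samples, delta)
    return draw

def min_spacing(values):
    """Smallest gap between any two elements (hypothetical helper)."""
    return int(np.diff(np.sort(values)).min())

# The two arrays from the question, with interval = 5 (required gap: 6)
print(min_spacing(np.array([46, 3, 29, 35, 39])))  # 4 -> too close
print(min_spacing(np.array([46, 3, 29, 15, 39])))  # 7 -> fine

# Repeated runs of spaced_choice should always satisfy the constraint
np.random.seed(123)
for _ in range(1000):
    sample = spaced_choice(1, 50, 5, 5)
    assert min_spacing(sample) >= 5 + 1
    assert sample.min() >= 1 and sample.max() < 50
```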

Please note that a draw-then-accept-or-reject-and-redraw strategy can be very expensive. In the worst-case example below, redrawing takes almost half a minute for just 10 samples because the accept/reject ratio is very poor. The insert-the-spaces-afterwards method has no problems of this kind.
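
To get a feel for how poor that ratio can be, the chance that a single redraw is accepted can be estimated by counting. This is a rough sketch that assumes the redraw rule used in the benchmark code (n_samples draws with replacement from np.arange(low, high), rejected if any two sorted neighbours differ by delta or less); the function name `acceptance_probability` is made up for this illustration:

```python
from math import comb, factorial

def acceptance_probability(low, high, delta, n_samples):
    """Probability that one draw with replacement is already well spaced."""
    n_values = high - low                        # size of np.arange(low, high)
    pool = n_values - (n_samples - 1) * delta    # same pool size as in add_spaces
    if pool < n_samples:
        return 0.0                               # constraint cannot be satisfied
    # Each valid set corresponds to one n_samples-subset of the reduced pool,
    # and every set can be drawn in n_samples! different orders.
    valid = comb(pool, n_samples) * factorial(n_samples)
    return valid / n_values ** n_samples

print(acceptance_probability(1, 100, 5, 5))   # roughly 0.28
print(acceptance_probability(1, 20, 1, 10))   # roughly 6e-07
```

Under these assumptions the second example needs on the order of a million redraws on average before one is accepted, which is consistent with the large timing gap.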

Time required by different methods for two examples:

low, high, delta, size = 1, 100, 5, 5
add_spaces            0.04245870 ms
redraw                0.11335560 ms
low, high, delta, size = 1, 20, 1, 10
add_spaces            0.03201030 ms
redraw            27881.01527220 ms

Code:

import numpy as np

import types
from timeit import timeit

def f_add_spaces(low, high, delta, n_samples):
    draw = np.random.choice(high-low-(n_samples-1)*delta, n_samples, replace=False)
    idx = np.argsort(draw)
    draw[idx] += np.arange(low, low + delta*n_samples, delta)
    return draw

def f_redraw(low, high, delta, n_samples):
    foo = np.random.choice(np.arange(low, high), n_samples)
    while any(x <= delta for x in np.diff(np.sort(foo))):
        foo = np.random.choice(np.arange(low, high), n_samples)
    return foo

for l, h, k, n in [(1, 100, 5, 5), (1, 20, 1, 10)]:
    print(f'low, high, delta, size = {l}, {h}, {k}, {n}')
    for name, func in list(globals().items()):
        if not name.startswith('f_') or not isinstance(func, types.FunctionType):
            continue
        print("{:16s}{:16.8f} ms".format(name[2:], timeit(
                'f(*args)', globals={'f':func, 'args':(l,h,k,n)}, number=10)*100))
