I would like to use numpy.random.choice() but make sure that the draws are spaced by at least a certain "interval". As a concrete example:
import numpy as np
np.random.seed(123)
interval = 5
foo = np.random.choice(np.arange(1,50), 5) ## 5 random draws from array([ 1, 2, ..., 49])
print(foo)
## array([46, 3, 29, 35, 39])
I would prefer these to be spaced by at least interval+1, i.e. 5+1=6. In the above example, this condition isn't met: there should be another random draw, as 35 and 39 are separated by 4, which is less than 6.
The array array([46, 3, 29, 15, 39]) would be ok, as all draws are spaced by at least 6.
numpy.random.choice(array, size) draws size samples from array. Is there another function that can check the "spacing" between elements in a numpy array? I could write the above with an if/while statement, but I'm not sure how to most efficiently check the spacing of elements in a numpy array.
Upvotes: 5
Views: 554
You can sort the array first so that all points are in ascending order, then use np.diff to find the difference between consecutive values. If any difference is less than or equal to the interval, then the condition has not been met, i.e.
import numpy as np
interval = 5
foo = np.random.choice(np.arange(1,50),5)
# Redraw until every gap between sorted neighbours is larger than the interval
while np.any(np.diff(np.sort(foo)) <= interval):
    foo = np.random.choice(np.arange(1,50),5)
print(foo)
This loops until you get a numpy array in which all values differ by more than the interval, i.e. by at least interval+1.
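The check itself can be pulled out into a small helper. The following is only a sketch: the name is_spaced is made up for illustration and is not a NumPy function. It applies the same sort-then-np.diff test to the two arrays from the question.

```python
import numpy as np

def is_spaced(values, interval):
    """Hypothetical helper: True if all values differ pairwise by more than `interval`."""
    # After sorting, only neighbouring values need to be compared.
    gaps = np.diff(np.sort(values))
    return bool(np.all(gaps > interval))

print(is_spaced([46, 3, 29, 35, 39], 5))  # False: 35 and 39 are only 4 apart
print(is_spaced([46, 3, 29, 15, 39], 5))  # True: the smallest gap is 7
```

Sorting first means only neighbouring values have to be compared, so the check costs one sort plus one pass over the array.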
Upvotes: 2
Here is a solution that inserts the spaces after drawing:
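import numpy as np

def spaced_choice(low, high, delta, n_samples):
    # Draw distinct values from a range shrunk by the total spacing to be inserted
    draw = np.random.choice(high-low-(n_samples-1)*delta, n_samples, replace=False)
    # Shift the k-th smallest draw by low + k*delta, so sorted neighbours end up at least delta+1 apart
    idx = np.argsort(draw)
    draw[idx] += np.arange(low, low + delta*n_samples, delta)
    return draw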
Sample run:
spaced_choice(4, 20, 3, 4)
# array([ 5, 9, 19, 13])
spaced_choice(1, 50, 5, 5)
# array([30, 8, 1, 15, 43])
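As a rough sanity check (a sketch, not part of the original answer), one can call the function repeatedly and confirm that every result stays within [low, high) and that sorted neighbours are at least delta+1 apart. The helper name check_spacing and the number of trials are arbitrary choices made for this illustration.

```python
import numpy as np

def spaced_choice(low, high, delta, n_samples):
    # Same approach as above: draw from a shrunk range, then insert the spaces.
    draw = np.random.choice(high - low - (n_samples - 1) * delta, n_samples, replace=False)
    idx = np.argsort(draw)
    draw[idx] += np.arange(low, low + delta * n_samples, delta)
    return draw

def check_spacing(low, high, delta, n_samples, trials=1000):
    """Hypothetical helper: True if every trial respects the range and the spacing."""
    for _ in range(trials):
        draw = spaced_choice(low, high, delta, n_samples)
        gaps = np.diff(np.sort(draw))
        if gaps.size and gaps.min() < delta + 1:
            return False
        if draw.min() < low or draw.max() >= high:
            return False
    return True

print(check_spacing(1, 50, 5, 5))
print(check_spacing(1, 20, 1, 10))
```

The second call uses the tightest possible parameters: the only valid result is the odd numbers 1, 3, ..., 19 in some order, and the function still returns on the first draw.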
Please note that a draw-then-accept-or-reject-and-redraw strategy can be very expensive. In the worst-case example below (where only a single set of 10 values satisfies the spacing), redrawing takes almost half a minute for just 10 samples because the accept/reject ratio is very poor. The insert-the-spaces-afterwards method has no problems of this kind.
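To see how poor the ratio can get, the chance that a single redraw is accepted can be computed exactly. The sketch below assumes the redraw scheme samples with replacement from np.arange(low, high) and accepts only if all sorted gaps are at least delta+1. The function name acceptance_probability is invented for this illustration. It relies on a standard counting argument: the number of valid sets is C(N - (n-1)*delta, n), where N = high - low is the number of candidate values, and each valid set can be drawn in n! orders.

```python
from math import comb, factorial

def acceptance_probability(low, high, delta, n_samples):
    """Hypothetical helper: chance that one draw with replacement is accepted."""
    n_values = high - low                      # size of np.arange(low, high)
    shrunk = n_values - (n_samples - 1) * delta
    if shrunk < n_samples:
        return 0.0                             # no valid configuration exists
    # Valid sets times their orderings, divided by all ordered draws.
    valid = comb(shrunk, n_samples) * factorial(n_samples)
    return valid / n_values ** n_samples

for args in [(1, 100, 5, 5), (1, 20, 1, 10)]:
    p = acceptance_probability(*args)
    print(args, p, "expected draws:", 1 / p)
```

For low, high, delta, size = 1, 100, 5, 5 this gives an acceptance probability of about 0.28, so a few redraws are enough. For 1, 20, 1, 10 it is about 6e-7, i.e. well over a million draws are needed on average.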
Time required by different methods for two examples:
low, high, delta, size = 1, 100, 5, 5
add_spaces 0.04245870 ms
redraw 0.11335560 ms
low, high, delta, size = 1, 20, 1, 10
add_spaces 0.03201030 ms
redraw 27881.01527220 ms
Code:
import numpy as np
import types
from timeit import timeit

def f_add_spaces(low, high, delta, n_samples):
    draw = np.random.choice(high-low-(n_samples-1)*delta, n_samples, replace=False)
    idx = np.argsort(draw)
    draw[idx] += np.arange(low, low + delta*n_samples, delta)
    return draw

def f_redraw(low, high, delta, n_samples):
    foo = np.random.choice(np.arange(low, high), n_samples)
    while any(x <= delta for x in np.diff(np.sort(foo))):
        foo = np.random.choice(np.arange(low, high), n_samples)
    return foo

for l, h, k, n in [(1, 100, 5, 5), (1, 20, 1, 10)]:
    print(f'low, high, delta, size = {l}, {h}, {k}, {n}')
    # Time every module-level function whose name starts with 'f_'
    for name, func in list(globals().items()):
        if not name.startswith('f_') or not isinstance(func, types.FunctionType):
            continue
        # timeit returns total seconds for 10 calls; *100 converts to ms per call
        print("{:16s}{:16.8f} ms".format(name[2:], timeit(
            'f(*args)', globals={'f':func, 'args':(l,h,k,n)}, number=10)*100))
Upvotes: 3