Reputation: 3573
I want to randomly choose from an array, with the requirement that the values in the output array are relabeled as consecutive integers starting at zero, keeping their relative order. For example, to draw 5 numbers between 0 and 5 one could do
np.random.choice(np.arange(6), 5)
array([5, 0, 5, 2, 5])
where, in this case, I would like this to be:
array([2, 0, 2, 1, 2])
Another example, if
np.random.choice(np.arange(6), 5)
array([1, 1, 1, 4, 2])
I am trying to "rebase" this in such a manner that it will be
array([0, 0, 0, 2, 1])
Final example...select 15 numbers between 0 and 5
np.random.choice(np.arange(6), 15)
array([4, 5, 3, 0, 4, 5, 3, 0, 2, 5, 2, 3, 2, 4, 4])
where eventually I want to end up with
array([3, 4, 2, 0, 3, 4, 2, 0, 1, 4, 1, 2, 1, 3, 3])
Upvotes: 0
Views: 2113
Reputation: 53029
If your original sequence contains only unique elements, sorting-based approaches like np.unique are actually a bit wasteful at O(n log n), since an O(n) solution is available (assuming n >= k, where k is the size of the set to choose from):
>>> import numpy as np
>>>
>>> to_choose_from = [1, 5, 7, 9, 10, 'hello', ()]
>>> n = 12
>>>
>>> k = len(to_choose_from)
>>> # make sure no duplicates - skip this if you happen to know
>>> assert len(set(to_choose_from)) == k
>>>
>>> chc = np.random.randint(0, k, (n,))
>>> chc
array([4, 4, 1, 5, 3, 1, 5, 5, 6, 1, 6, 6])
>>>
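>>> # lookup table: mark which indices were actually drawn
>>> occur = np.zeros((k,), int)
>>> occur[chc] = 1
>>> # sorted positions of the drawn indices
>>> idx, = np.where(occur)
>>> # overwrite each mark with the rank of that index among the drawn ones
>>> occur[idx] = np.arange(idx.size)
>>> # look up the rank of every draw
>>> result = occur[chc]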
>>> result
array([2, 2, 0, 3, 1, 0, 3, 3, 4, 0, 4, 4])
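For reuse, the same lookup-table idea could be wrapped in a small function. This is only a sketch: the name `rebase_by_lookup` is made up for illustration, it assumes the input holds integers in the range [0, k), and it swaps the `np.where` / `np.arange` step for a cumulative sum, which produces the same ranks.

```python
import numpy as np

def rebase_by_lookup(values, k):
    """Hypothetical helper: map each integer in `values` (assumed to lie
    in [0, k)) to its rank among the distinct values that occur."""
    values = np.asarray(values)
    # Mark which of the k possible values were actually drawn.
    present = np.zeros(k, dtype=bool)
    present[values] = True
    # Running count of marked slots, minus one, is the rank of each
    # drawn value; slots that were never drawn are never looked up.
    rank = np.cumsum(present) - 1
    return rank[values]

rebased = rebase_by_lookup([5, 0, 5, 2, 5], 6)
print(rebased)  # [2 0 2 1 2]
```

No sorting happens here, so the work is proportional to n + k.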
Upvotes: 0
Reputation: 30561
What you're looking to do is replace each entry in your original array x by its index in the array of unique elements of x (in sorted order). For example, if x is np.array([7, 6, 2, 7, 7, 2]), the unique elements of x are [2, 6, 7], and we want to replace each number with its position in that array of unique elements: that is, replace each 2 with 0, each 6 with 1 and each 7 with 2.
The numpy.unique function does both these jobs: it finds the (sorted) array of unique elements for you, and if you pass return_inverse=True, np.unique will also give you a second return value that contains exactly the indices you're after. So all you need to do is call np.unique with return_inverse=True, throw away the first return value, and keep the second. Examples:
>>> import numpy as np
>>> np.unique([5, 0, 5, 2, 5], return_inverse=True)[1]
array([2, 0, 2, 1, 2])
>>> x = np.array([4, 5, 3, 0, 4, 5, 3, 0, 2, 5, 2, 3, 2, 4, 4])
>>> np.unique(x, return_inverse=True)[1]
array([3, 4, 2, 0, 3, 4, 2, 0, 1, 4, 1, 2, 1, 3, 3])
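To see what the two return values look like side by side, here is a small sketch using the second example from the question. The variable names `uniq` and `inverse` are just illustrative; the last line checks that indexing the unique values with the inverse indices gives back the original 1-D array.

```python
import numpy as np

x = np.array([1, 1, 1, 4, 2])

# First return value: the sorted unique elements.
# Second return value: for each entry of x, its index into `uniq`.
uniq, inverse = np.unique(x, return_inverse=True)

print(uniq)     # [1 2 4]
print(inverse)  # [0 0 0 2 1]

# For a 1-D input, the inverse indices reconstruct the original array.
roundtrip_ok = np.array_equal(uniq[inverse], x)
```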
Upvotes: 4
Reputation: 114481
You could start from a randomly chosen array
x = np.random.choice(np.arange(6), 5)
then collect the unique values and sort them
v = sorted(set(x))
and finally map each original value to its index in v:
result = [v.index(y) for y in x]
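Since v.index scans the list on every lookup, a possible variation is to build a dictionary from value to position once and look each element up in that. This is a sketch rather than part of the answer above; the name `position` is made up, and a fixed input is used in place of the random draw so the output is reproducible.

```python
import numpy as np

# Fixed stand-in for np.random.choice(np.arange(6), 5).
x = np.array([1, 1, 1, 4, 2])

# Sorted distinct values, converted to plain Python ints.
v = sorted(set(x.tolist()))

# Hypothetical lookup table: value -> index in v.
position = {value: i for i, value in enumerate(v)}

result = [position[y] for y in x.tolist()]
print(result)  # [0, 0, 0, 2, 1]
```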
Upvotes: 2