Aenaon

Reputation: 3573

Sequential random choice in Python

I want to sample randomly from an array, with the requirement that the distinct values in the output form a run of consecutive integers starting at zero, keeping their relative order. For example, if I want to get 5 numbers between 0 and 5 then one could do

np.random.choice(np.arange(6), 5)
array([5, 0, 5, 2, 5])

where, in this case, I would like this to be:

array([2, 0, 2, 1, 2])

Another example, if

np.random.choice(np.arange(6), 5)
array([1, 1, 1, 4, 2])

I am trying to "rebase" this in such a manner that it will be

array([0, 0, 0, 2, 1])

Final example...select 15 numbers between 0 and 5

np.random.choice(np.arange(6), 15)
array([4, 5, 3, 0, 4, 5, 3, 0, 2, 5, 2, 3, 2, 4, 4])

where eventually I want to end up with

array([3, 4, 2, 0, 3, 4, 2, 0, 1, 4, 1, 2, 1, 3, 3])

Upvotes: 0

Views: 2113

Answers (3)

Paul Panzer

Reputation: 53029

If your original sequence contains only unique elements, sorting-based approaches like np.unique are actually a bit wasteful at O(n log n), since an O(n) solution is available (assuming n >= k, where n is the number of draws and k is the size of the set to choose from):

>>> import numpy as np
>>>
>>> to_choose_from = [1, 5, 7, 9, 10, 'hello', ()]
>>> n = 12
>>> 
>>> k = len(to_choose_from)
>>> # make sure no duplicates - skip this if you happen to know
>>> assert len(set(to_choose_from)) == k
>>> 
>>> chc = np.random.randint(0, k, (n,))
>>> chc
array([4, 4, 1, 5, 3, 1, 5, 5, 6, 1, 6, 6])
>>> 
>>> occur = np.zeros((k,), int)
>>> occur[chc] = 1
>>> idx, = np.where(occur)
>>> occur[idx] = np.arange(idx.size)
>>> result = occur[chc]
>>> result
array([2, 2, 0, 3, 1, 0, 3, 3, 4, 0, 4, 4])
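The session above can be wrapped into a small function. This is only a sketch: the name rebase_indices is hypothetical, and it assumes the draws are integer indices in range(k), as produced by np.random.randint(0, k, ...).

```python
import numpy as np

def rebase_indices(chc, k):
    """Relabel integer draws from range(k) as 0..m-1, keeping their order.

    Hypothetical helper; assumes every entry of chc lies in range(k).
    """
    chc = np.asarray(chc)
    occur = np.zeros(k, dtype=int)
    occur[chc] = 1                      # mark every index that was drawn
    idx, = np.where(occur)              # drawn indices, in increasing order
    occur[idx] = np.arange(idx.size)    # overwrite each mark with its rank
    return occur[chc]                   # look up the rank of each draw

chc = np.array([4, 4, 1, 5, 3, 1, 5, 5, 6, 1, 6, 6])
result = rebase_indices(chc, 7)
```

No sorting is involved: the work is one pass over the k slots plus one pass over the n draws.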

Upvotes: 0

Mark Dickinson

Reputation: 30561

What you're looking to do is replace each entry in your original array x by its index in the array of unique elements of x (in sorted order). For example, if x is np.array([7, 6, 2, 7, 7, 2]), the unique elements of x are [2, 6, 7], and we want to replace each number with its position in that array of unique elements: that is, replace each 2 with 0, each 6 with 1 and each 7 with 2.

The numpy.unique function does both these jobs: it finds the (sorted) array of unique elements for you, and if you pass return_inverse=True, np.unique will also give you a second return value that contains exactly the indices you're after. So all you need to do is call np.unique with return_inverse=True, throw away the first return value, and keep the second. Examples:

>>> import numpy as np
>>> np.unique([5, 0, 5, 2, 5], return_inverse=True)[1]
array([2, 0, 2, 1, 2])
>>> x = np.array([4, 5, 3, 0, 4, 5, 3, 0, 2, 5, 2, 3, 2, 4, 4])
>>> np.unique(x, return_inverse=True)[1]
array([3, 4, 2, 0, 3, 4, 2, 0, 1, 4, 1, 2, 1, 3, 3])
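To see how the two return values relate, here is a short sketch using the x from the explanation above. The variable names uniq and inv are just illustrative choices.

```python
import numpy as np

x = np.array([7, 6, 2, 7, 7, 2])

# uniq holds the sorted distinct values of x;
# inv holds, for each entry of x, its position in uniq.
uniq, inv = np.unique(x, return_inverse=True)

# Indexing uniq with inv rebuilds the original array.
rebuilt = uniq[inv]
```

Here uniq is [2, 6, 7] and inv is [2, 1, 0, 2, 2, 0], which is exactly the "rebased" array.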

Upvotes: 4

6502

Reputation: 114481

What you could do is start from a randomly chosen array

x = np.random.choice(np.arange(6), 5)

then collect the unique values and sort them

v = sorted(set(x))

then map the original value to the index in v:

result = [v.index(y) for y in x]
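Since v.index(y) scans v on every lookup, a possible variation is to build a dictionary from value to position once and look each element up in it. This is a sketch of that swap, not part of the approach above; the name rebase_with_dict is hypothetical.

```python
def rebase_with_dict(x):
    """Map each value of x to its rank among the sorted distinct values.

    Hypothetical helper: same result as [v.index(y) for y in x],
    but with a dictionary lookup instead of a linear search.
    """
    v = sorted(set(x))
    pos = {value: i for i, value in enumerate(v)}
    return [pos[y] for y in x]

result = rebase_with_dict([1, 1, 1, 4, 2])
```

For only six possible values the difference is negligible; it matters mainly when the number of distinct values is large.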

Upvotes: 2
