Reputation: 12407
I have a 2D numpy array of 0s and 1s.
a = np.array([[1, 0, 0, 1], [0, 1, 0, 1]])
What I need is to create a new array a_new according to this rule: for each 1 at location [l, k] of array a, pick a random number according to a desired distribution (e.g. shift = np.int64(np.ceil(np.random.gamma(1, 3)))) and put a 1 at a_new[l, k, shift] if shift is smaller than N; otherwise ignore that 1.
Here is a loop implementation of it. Is there a faster (maybe array-operation) solution to this? The array a is large.
import numpy as np
N = 5
a = np.array([[1, 0, 0, 1], [0, 1, 0, 1]])
a_new = np.zeros((a.shape[0], a.shape[1], N))
for k in np.arange(a.shape[1]):
    for l in np.arange(a.shape[0]):
        if a[l, k]:
            shift = np.int64(np.ceil(np.random.gamma(1, 3)))
            if (shift < N):
                a_new[l, k, shift] = 1
output sample:
a
[[1 0 0 1]
 [0 1 0 1]]
a_new
[[[0. 0. 1. 0. 0.]
  [0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]
  [0. 0. 1. 0. 0.]]
 [[0. 0. 0. 0. 0.]
  [0. 0. 0. 1. 0.]
  [0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]]]
Upvotes: 2
Views: 220
Reputation: 12407
I found a solution using NumPy advanced indexing that is significantly faster. Basically, we can use the positions of the nonzero elements of a as indices into the first two dimensions of a_new, and the random values of shift as indices into the third dimension of a_new, with some filtering beforehand (removing out-of-bounds random numbers, and flattening the shift array so that it lines up with the nonzero entries of a). Note that the code zeroes the filtered-out entries of a in place. Here is the working code:
import numpy as np
N = 5
a = np.array([[1, 0, 0, 1], [0, 1, 0, 1]])
a_new = np.zeros((a.shape[0], a.shape[1], N))
shift = np.int64(np.ceil(np.random.gamma(1, 3, a.shape)))
a[shift > N-1] = 0
shift[shift > N-1] = 0
shift = (shift * a).reshape(1, -1)
shift = shift[shift > 0]
non_zero_a = np.nonzero(a)
a_new[non_zero_a[0], non_zero_a[1], shift] = 1
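For what it's worth, the same idea can be written without touching a, by drawing one value per nonzero entry and filtering with a boolean mask. This is only a sketch: the helper name scatter_shifts and the use of np.random.default_rng are my own choices for illustration, not part of the code above.

```python
import numpy as np

def scatter_shifts(a, N, rng=None):
    """Hypothetical helper: mark one random shift per nonzero entry of a."""
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = np.nonzero(a)                    # positions of the 1s
    # One gamma draw per nonzero entry, rounded up to an integer index
    shift = np.ceil(rng.gamma(1, 3, rows.size)).astype(np.int64)
    keep = shift < N                              # drop draws outside the third axis
    a_new = np.zeros(a.shape + (N,))
    a_new[rows[keep], cols[keep], shift[keep]] = 1
    return a_new

a = np.array([[1, 0, 0, 1], [0, 1, 0, 1]])
a_new = scatter_shifts(a, N=5, rng=np.random.default_rng(0))
```

Passing a seeded generator only makes the run repeatable; a itself is left unchanged.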
Upvotes: 0
Reputation: 53029
Here is a trick using np.bincount:
a = np.array([[1, 0, 0, 1], [0, 1, 0, 1]])
N = 5
X,Y = a.nonzero()
Z = np.ceil(np.random.gamma(1,3,X.shape)).astype(int)
Z
# array([ 2, 1, 15, 2])
flat = np.ravel_multi_index((X[Z<N],Y[Z<N],Z[Z<N]),a.shape+(N,))
np.bincount(flat,None,a.size*N).reshape(*a.shape,N)
# array([[[0, 0, 1, 0, 0],
#         [0, 0, 0, 0, 0],
#         [0, 0, 0, 0, 0],
#         [0, 1, 0, 0, 0]],
#
#        [[0, 0, 0, 0, 0],
#         [0, 0, 0, 0, 0],
#         [0, 0, 0, 0, 0],
#         [0, 0, 1, 0, 0]]])
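In case the ravel_multi_index step looks opaque: it turns each (x, y, z) triple into the position that element has in the flattened, C-ordered array of shape a.shape + (N,), which works out to (x * a.shape[1] + y) * N + z. bincount then counts how often each flat position occurs. A small check, with Z fixed to the sample draws shown above instead of being random:

```python
import numpy as np

a = np.array([[1, 0, 0, 1], [0, 1, 0, 1]])
N = 5
X, Y = a.nonzero()
Z = np.array([2, 1, 15, 2])      # sample draws from above, fixed for reproducibility
keep = Z < N                     # the 15 is out of range and is dropped

flat = np.ravel_multi_index((X[keep], Y[keep], Z[keep]), a.shape + (N,))
# The same flat positions computed by hand (C order)
manual = (X[keep] * a.shape[1] + Y[keep]) * N + Z[keep]
# flat and manual are both array([ 2, 16, 37])

counts = np.bincount(flat, minlength=a.size * N).reshape(a.shape + (N,))
```

The filtering has to happen before the call, because ravel_multi_index raises on out-of-range indices by default.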
UPDATE: With multiplicities:
a = np.array([[1, 0, 0, 3], [0, 10, 0, 1]])
N = 5
X,Y = a.nonzero()
times = a[X,Y]
X = X.repeat(times)
Y = Y.repeat(times)
Z = np.ceil(np.random.gamma(1,3,X.shape)).astype(int)
flat = np.ravel_multi_index((X[Z<N],Y[Z<N],Z[Z<N]),a.shape+(N,))
np.bincount(flat,None,a.size*N).reshape(*a.shape,N)
# array([[[0, 1, 0, 0, 0],
#         [0, 0, 0, 0, 0],
#         [0, 0, 0, 0, 0],
#         [0, 1, 0, 0, 0]],
#
#        [[0, 0, 0, 0, 0],
#         [0, 4, 2, 0, 3],
#         [0, 0, 0, 0, 0],
#         [0, 0, 0, 0, 0]]])
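A note on why bincount is used here rather than plain fancy-index assignment: when the same index occurs more than once, a_new[X, Y, Z] += 1 increments that position only once, so the multiplicities would be lost. np.add.at is an unbuffered alternative that does accumulate; it may be slower than bincount on large inputs, so time it on your own data. A sketch with made-up indices (not drawn from the gamma distribution):

```python
import numpy as np

shape = (2, 4, 5)
# Made-up indices for illustration: position (1, 1, 2) occurs three times
X = np.array([0, 1, 1, 1])
Y = np.array([0, 1, 1, 1])
Z = np.array([1, 2, 2, 2])

buffered = np.zeros(shape, dtype=int)
buffered[X, Y, Z] += 1                 # a repeated index is incremented only once

unbuffered = np.zeros(shape, dtype=int)
np.add.at(unbuffered, (X, Y, Z), 1)    # every occurrence is counted

flat = np.ravel_multi_index((X, Y, Z), shape)
via_bincount = np.bincount(flat, minlength=int(np.prod(shape))).reshape(shape)
```

Here unbuffered and via_bincount agree, while buffered undercounts the repeated position.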
Upvotes: 1
Reputation: 224
You could also try the following, which may help:
import numpy as np
N = 5
a = np.array([[1, 0, 0, 1], [0, 1, 0, 1]])
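row, col = np.where(a == 1)
a_new = np.zeros([a.shape[0], a.shape[1], N])
# One draw per nonzero entry; keep only draws that fit along the third axis
shift = np.int64(np.ceil(np.random.gamma(1, 3, row.size)))
keep = shift < N
a_new[row[keep], col[keep], shift[keep]] = 1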
Upvotes: 0