How to randomly change positions of non-zero entries of an array where certain rows are excluded

Question

I have a numpy array consisting of a lot of 0s and a few non-zero entries e.g. like this (just a toy example):

myArray = np.array([[ 0.       ,  0.       ,  0.79],
       [ 0.       ,  0.       ,  0.       ],
       [ 0.       ,  0.       ,  0.       ],
       [ 0.       ,  0.435    ,  0.       ]])

Now I would like to move each of the non-zero entries with a given probability, so some entries are moved and others stay at their current position. Some rows are not allowed to contain a non-zero entry, so values must not be moved there. I implemented that as follows:

import numpy as np

# for reproducibility
np.random.seed(2)

myArray = np.array([[ 0.       ,  0.       ,  0.79],
       [ 0.       ,  0.       ,  0.       ],
       [ 0.       ,  0.       ,  0.       ],
       [ 0.       ,  0.435    ,  0.       ]])

# list of rows where numbers are not allowed to be moved to   
ignoreRows = [2]

# moving probability
probMove =  0.3

# get non-zero entries
nzEntries = np.nonzero(myArray) 

# indices of the non-zero entries as tuples
indNZ = list(zip(nzEntries[0], nzEntries[1]))  # list() keeps this indexable in python 3.x

# store values
valNZ = [myArray[i] for i in indNZ] 

# generating probabilities for moving for each non-zero entry
# (len(nzEntries) is the number of dimensions, so use len(valNZ))
lProb = np.random.rand(len(valNZ))

allowedRows = [ind for ind in xrange(myArray.shape[0]) if ind not in ignoreRows]  # replace by "range" in python 3.x
allowedCols = [ind for ind in xrange(myArray.shape[1])]  # replace by "range" in python 3.x

for indProb, prob in enumerate(lProb):
    # only move with a certain probability
    if prob <= probMove:
        # randomly change position
        myArray[np.random.choice(allowedRows), np.random.choice(allowedCols)] = valNZ[indProb]

        # set old position to zero
        myArray[indNZ[indProb]] = 0.

print myArray

First, I determine the indices and values of all non-zero entries. Then I draw a random number for each of these entries, which determines whether the entry will be moved. Then I get the allowed target rows.

In the second step, I loop through the list of indices and move the entries according to their moving probability. This is done by choosing from the allowed rows and columns, assigning the respective value to the new indices and setting the "old" position to 0.

It works fine with the code above. However, speed really matters in this case, and I wonder whether there is a more efficient way of doing this.

EDIT: Hpaulj's answer helped me to get rid of the for-loop which is nice and the reason why I accepted his answer. I incorporated his comments and posted an answer below as well, just in case someone else stumbles over this example and wonders how I used his answer in the end.

hpaulj · Accepted Answer

You can index elements with arrays, so:

valNZ=myArray[nzEntries]

can replace the zip and comprehension.
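
As a small check of that claim, the sketch below reuses the question's toy array and compares the list comprehension with direct indexing by the tuple that np.nonzero returns. The names by_loop and by_index are only for illustration.

```python
import numpy as np

myArray = np.array([[0., 0., 0.79],
                    [0., 0., 0.],
                    [0., 0., 0.],
                    [0., 0.435, 0.]])

# np.nonzero returns one index array per dimension
nzEntries = np.nonzero(myArray)

# approach from the question: build (row, col) tuples and look up each value
by_loop = [myArray[i] for i in zip(nzEntries[0], nzEntries[1])]

# indexing with the tuple of index arrays returns the same values as an array
by_index = myArray[nzEntries]

print(by_loop)
print(by_index)
```

The indexed result is a NumPy array rather than a list, which is what makes the later boolean filtering possible.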

Simplify these 2 assignments:

allowedCols=np.arange(myArray.shape[1])
allowedRows=np.delete(np.arange(myArray.shape[0]), ignoreRows)
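
To see that the np.delete form gives the same rows as the question's list comprehension, here is a minimal sketch using the toy shape (4 rows, 3 columns). The variable rows_loop is only there for the comparison.

```python
import numpy as np

n_rows, n_cols = 4, 3      # shape of the toy array in the question
ignoreRows = [2]

# list-comprehension version from the question
rows_loop = [ind for ind in range(n_rows) if ind not in ignoreRows]

# np.delete drops the entries at the given positions; because
# np.arange(n)[k] == k, this removes exactly the ignored row numbers
allowedRows = np.delete(np.arange(n_rows), ignoreRows)
allowedCols = np.arange(n_cols)

print(allowedRows, allowedCols)
```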

With:

I=lProb<=probMove



you don't need to perform the prob test each time in the loop; just iterate over the selected valNZ and indNZ.
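
A sketch of that idea, assuming the mask is the comparison lProb <= probMove used in the question's loop. The names move and the *_to_move variables are illustrative, and the seeded RandomState is only there for reproducibility.

```python
import numpy as np

rng = np.random.RandomState(2)   # seeded only for reproducibility

myArray = np.array([[0., 0., 0.79],
                    [0., 0., 0.],
                    [0., 0., 0.],
                    [0., 0.435, 0.]])
probMove = 0.3

rows, cols = np.nonzero(myArray)
valNZ = myArray[rows, cols]

# one random number per non-zero entry
lProb = rng.rand(len(valNZ))

# boolean mask: True where the entry is selected for moving
move = lProb <= probMove

# keep only the selected entries; no per-element test is needed afterwards
rows_to_move = rows[move]
cols_to_move = cols[move]
vals_to_move = valNZ[move]
```

Keeping the row and column indices as separate arrays, rather than as a list of tuples, is what allows them to be filtered with the same mask.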


I think your random.choice can be generated for all of these valNZ at once:

np.random.choice(np.arange(10), 10, True)  
# 10 choices from the range with replacement


With that it should be possible to move all of the points without a loop.

I haven't worked out the details yet.
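
One way the destination draw might look for the question's setup, as a sketch only. The count n_moves is a hypothetical number of entries selected for moving, and the seed is fixed just for reproducibility.

```python
import numpy as np

rng = np.random.RandomState(0)     # seeded only for reproducibility

allowedRows = np.array([0, 1, 3])  # row 2 excluded, as in the question
allowedCols = np.arange(3)
n_moves = 5                        # hypothetical number of entries to move

# one call per axis draws all destinations at once, with replacement
dest_rows = rng.choice(allowedRows, size=n_moves, replace=True)
dest_cols = rng.choice(allowedCols, size=n_moves, replace=True)

# a tuple of index arrays can be used directly as arr[dest]
dest = (dest_rows, dest_cols)
```

Because the draw is with replacement, two entries can receive the same destination.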

There is one way in which your iterative move will differ from any parallel one. If a chosen destination already holds another value, the iterative approach can overwrite it, and may move a given value a couple of times. Parallel code will not perform the sequential moves. You have to decide which of the two behaviors is correct for your purpose.
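
A tiny deterministic illustration of that difference, using a 1-D array and hand-picked (hypothetical) destinations instead of random ones: the value 5 is sent to a slot that currently holds 7, and 7 is sent elsewhere.

```python
import numpy as np

start = np.array([5, 7, 0, 0])
src = np.array([0, 1])     # current positions of the non-zero values
vals = start[src]          # fancy indexing returns a copy: [5, 7]
dest = np.array([1, 3])    # hypothetical destinations: 5 -> slot 1, 7 -> slot 3

# iterative move: write the value, then zero its old position
iterative = start.copy()
for i in range(len(vals)):
    iterative[dest[i]] = vals[i]
    iterative[src[i]] = 0

# parallel move: zero all old positions first, then write all values
parallel = start.copy()
parallel[src] = 0
parallel[dest] = vals

print(iterative)  # [0 0 0 7]
print(parallel)   # [0 5 0 7]
```

In the iterative run the 5 lands on slot 1 and is then wiped when slot 1 is cleared as the old position of the 7, so the two approaches end with different arrays.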

There is a ufunc method, .at, that performs unbuffered operations. It works for operations like add, but I don't know if it would apply to an indexing move like this.
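
For reference, this is what unbuffered means for np.add.at with repeated indices. The sketch shows only the add case mentioned above, not a move.

```python
import numpy as np

idx = np.array([0, 0, 2])    # index 0 appears twice

# buffered: the in-place add through fancy indexing applies index 0 once
buffered = np.zeros(3, dtype=int)
buffered[idx] += 1

# unbuffered: np.add.at applies the operation for every occurrence
unbuffered = np.zeros(3, dtype=int)
np.add.at(unbuffered, idx, 1)

print(buffered)    # [1 0 1]
print(unbuffered)  # [2 0 1]
```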



simplified version of the iterative moving:

In [106]: arr=np.arange(20).reshape(4,5)
In [107]: I=np.nonzero(arr>10)
In [108]: v=arr[I]
In [109]: rows,cols=np.arange(4),np.arange(5)

In [110]: for i in range(len(v)):
    dest=(np.random.choice(rows),np.random.choice(cols))
    arr[dest]=v[i]
    arr[I[0][i],I[1][i]] = 0

In [111]: arr
Out[111]: 
array([[ 0, 18,  2, 14, 11],
       [ 5, 16,  7, 13, 19],
       [10,  0,  0,  0,  0],
       [ 0, 17,  0,  0,  0]])


possible vectorized version:

In [117]: dest=(np.random.choice(rows,len(v),True),np.random.choice(cols,len(v),True)) 
In [118]: dest
Out[118]: (array([1, 1, 3, 1, 3, 2, 3, 0, 0]), array([3, 0, 0, 1, 2, 3, 4, 0, 1]))

In [119]: arr[dest]
Out[119]: array([ 8,  5, 15,  6, 17, 13, 19,  0,  1])
In [120]: arr[I]=0
In [121]: arr[dest]=v

In [122]: arr
Out[122]: 
array([[18, 19,  2,  3,  4],
       [12, 14,  7, 11,  9],
       [10,  0,  0, 16,  0],
       [13,  0, 15,  0, 17]])


If arr[I]=0 is applied after arr[dest]=v, there are more zeros:

In [124]: arr[dest]=v    
In [125]: arr[I]=0
In [126]: arr
Out[126]: 
array([[18, 19,  2,  3,  4],
       [12, 14,  7, 11,  9],
       [10,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0]])


same dest, but done iteratively:

In [129]: for i in range(len(v)):
   .....:     arr[dest[0][i],dest[1][i]] = v[i]
   .....:     arr[I[0][i],I[1][i]] = 0

In [130]: arr
Out[130]: 
array([[18, 19,  2,  3,  4],
       [12, 14,  7, 11,  9],
       [10,  0,  0, 16,  0],
       [ 0,  0,  0,  0,  0]])


With this small size and high moving density, the differences between the iterative and vectorized solutions are large. For a sparse array there would be fewer.
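
Putting the pieces together for the question's setup, a loop-free version could look roughly like the sketch below. The function name move_nonzero and its parameters are hypothetical, and it uses the "zero first, then assign" ordering, so it shares the collision behavior discussed above rather than reproducing the iterative result.

```python
import numpy as np

def move_nonzero(arr, ignore_rows, prob_move, rng):
    """Return a copy of arr with non-zero entries randomly moved (hypothetical helper)."""
    out = arr.copy()
    rows, cols = np.nonzero(out)
    vals = out[rows, cols]

    # select the entries to move with one vectorized comparison
    move = rng.rand(len(vals)) <= prob_move
    n_moves = int(move.sum())
    if n_moves == 0:
        return out

    # draw all destinations at once from the allowed rows and columns
    allowed_rows = np.delete(np.arange(out.shape[0]), ignore_rows)
    allowed_cols = np.arange(out.shape[1])
    dest_rows = rng.choice(allowed_rows, size=n_moves)
    dest_cols = rng.choice(allowed_cols, size=n_moves)

    # clear the old positions first, then write the moved values
    out[rows[move], cols[move]] = 0
    out[dest_rows, dest_cols] = vals[move]
    return out

myArray = np.array([[0., 0., 0.79],
                    [0., 0., 0.],
                    [0., 0., 0.],
                    [0., 0.435, 0.]])

result = move_nonzero(myArray, ignore_rows=[2], prob_move=0.3,
                      rng=np.random.RandomState(2))
print(result)
```

If two moved values draw the same destination, or a moved value lands on an entry that stayed put, one of them is overwritten, so the number of non-zero entries can shrink. Whether that is acceptable depends on the application.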
