

Vectorized generation of a numpy column with values under a condition

Suppose I have a numpy array generated like so:

import numpy as np

np.random.seed(1)
arr = np.random.randint(10, size=(5, 2))

which produces the following:

array([[5, 8],
       [9, 5],
       [0, 0],
       [1, 7],
       [6, 9]])

How do I:

The following would be invalid because, in the first row, the third column is 8 and the second column is also 8:

np.append(arr, np.random.randint(10,size=(5,1)), axis=1)

array([[5, 8, 8],
       [9, 5, 8],
       [0, 0, 6],
       [1, 7, 2],
       [6, 9, 8]])
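
To make the condition concrete: a row is invalid when its new value repeats any value already in that row. A minimal sketch of that check, using the array above (the names `candidate` and `violations` are only illustrative):

```python
import numpy as np

candidate = np.array([[5, 8, 8],
                      [9, 5, 8],
                      [0, 0, 6],
                      [1, 7, 2],
                      [6, 9, 8]])

# Compare the last column against every earlier column, row by row.
# candidate[:, -1:] keeps a trailing axis so it broadcasts across the row.
violations = (candidate[:, :-1] == candidate[:, -1:]).any(axis=1)
print(violations.tolist())  # [True, False, False, False, False]
```

Only the first row is flagged, which matches the description above.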

A sub-question:

I understand that this can be done with plain for loops, but that would be far too slow for millions of rows, so I am looking for a vectorized solution.

Upvotes: 1

Views: 44

Answers (1)

kuzand

Reputation: 9806

Here's one way to do it:

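import random

import numpy as np

def my_randint(a, select_columns, n):
    # Candidates: integers in [0, n) not present in the selected columns of row a.
    integers = list(set(range(n)).difference(a[select_columns]))
    return random.choice(integers)

# apply_along_axis calls my_randint once per row of arr.
new_col = np.apply_along_axis(my_randint, axis=1, arr=arr, select_columns=[0, 1], n=10)
new_arr = np.hstack([arr, new_col[:, None]])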

Note that I use random.choice instead of np.random.choice because it is faster.
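
Since np.apply_along_axis still calls the Python function once per row, a different approach may scale better to millions of rows: draw the whole column at once and redraw only the rows that collide. This is a rough sketch of rejection sampling, not the method above. The function name `append_distinct_column` and its parameters are hypothetical, and it assumes every row contains fewer than n distinct values (otherwise the loop would never finish).

```python
import numpy as np

def append_distinct_column(arr, n, rng=None):
    """Append a column of integers in [0, n) that differ from every value in their row.

    Assumes each row of arr holds fewer than n distinct values in [0, n).
    """
    rng = np.random.default_rng() if rng is None else rng
    new_col = rng.integers(n, size=arr.shape[0])
    # Rows whose drawn value collides with an existing value in that row.
    bad = (arr == new_col[:, None]).any(axis=1)
    while bad.any():
        # Redraw only the colliding rows, then recheck.
        new_col[bad] = rng.integers(n, size=int(bad.sum()))
        bad = (arr == new_col[:, None]).any(axis=1)
    return np.hstack([arr, new_col[:, None]])

arr = np.array([[5, 8], [9, 5], [0, 0], [1, 7], [6, 9]])
new_arr = append_distinct_column(arr, 10)
```

Each pass is a handful of array operations, and with n=10 and two existing columns at least 80% of rows are accepted per pass, so only a few passes should be needed in practice.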

Upvotes: 2
