Minimize runtime for NumPy array manipulation

Question

I have a 2-dimensional array with np.shape(input)=(a,b) that looks like

input=array[array_1[0,0,0,1,0,1,2,0,3,3,2,...,entry_b],...array_a[1,0,0,1,2,2,0,3,1,3,3,...,entry_b]]

Now I want to create an array with np.shape(output)=(a,b,b) in which, for each row of the input, entry [j,k] is 1 if positions j and k of that row hold the same value, and 0 otherwise.

For example:

input=[[1,0,0,0,1,2]]

output=[array([[1., 0., 0., 0., 1., 0.],
               [0., 1., 1., 1., 0., 0.],
               [0., 1., 1., 1., 0., 0.],
               [0., 1., 1., 1., 0., 0.],
               [1., 0., 0., 0., 1., 0.],
               [0., 0., 0., 0., 0., 1.]])]

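My code so far looks like this:

import numpy as np

def get_matrix(svdata, padding_size):
    # Collect one (padding_size, padding_size) matrix per row of svdata.
    List = []
    for k in svdata:
        matrix = np.zeros((padding_size, padding_size))
        # Compare every pair of positions (l, m) within the row.
        for l in range(padding_size):
            for m in range(padding_size):
                if k[l] == k[m]:
                    matrix[l][m] = 1
        List.append(matrix)
    return List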

But it takes 2:30 min for an input array of shape (2000,256). How can I make this more efficient using built-in NumPy functionality?

Frank Yellin · Accepted Answer

You're trying to create the array y where y[i,j,k] is 1 if input[i,j] == input[i,k] and 0 otherwise. At least that's what I think you're trying to do.

So y = input[:,:,None] == input[:,None,:] will give you a boolean array of shape (a,b,b). The two indexed views have shapes (a,b,1) and (a,1,b), and broadcasting compares every pair of entries within each row. You can then convert that to float64 using astype(...) if you want.
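As a minimal sketch of the broadcasting approach described above, assuming the input is (or can be converted to) a 2-D NumPy array. The function name pairwise_equal and the variable names are hypothetical, chosen only for illustration, and the data is the example from the question:

```python
import numpy as np

def pairwise_equal(data):
    """Return y with y[i, j, k] = 1.0 where data[i, j] == data[i, k], else 0.0."""
    data = np.asarray(data)
    # Shapes (a, b, 1) and (a, 1, b) broadcast to (a, b, b).
    mask = data[:, :, None] == data[:, None, :]
    # Convert the boolean mask to float64 to match the question's output.
    return mask.astype(np.float64)

example = np.array([[1, 0, 0, 0, 1, 2]])
result = pairwise_equal(example)
print(result.shape)
print(result[0])
```

For an input of shape (2000,256) the float64 result holds 2000*256*256 values, which is roughly 1 GB, so keeping the boolean mask (about 131 MB) may be preferable if the downstream code accepts it.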
