Ouroboroski

Reputation: 171

How can I permute only certain entries of a NumPy 2D array?

I have a NumPy 2D array of shape (N, N) representing an NxN correlation matrix. This matrix is symmetric. Say N=5, then an example of this 2D array would be:

x = np.array([[1.00, 0.46, 0.89, 0.76, 0.65],
              [0.46, 1.00, 0.83, 0.88, 0.29],
              [0.89, 0.83, 1.00, 0.57, 0.84],
              [0.76, 0.88, 0.57, 1.00, 0.39],
              [0.65, 0.29, 0.84, 0.39, 1.00]])

I would like to obtain P copies of x where the diagonal remains the same but the upper- and lower-triangular halves of the matrix are permuted in unison.

An example of one of these copies could be:

np.array([[1.00, 0.65, 0.89, 0.84, 0.39],
          [0.65, 1.00, 0.76, 0.83, 0.88],
          [0.89, 0.76, 1.00, 0.29, 0.57],
          [0.84, 0.83, 0.29, 1.00, 0.46],
          [0.39, 0.88, 0.57, 0.46, 1.00]])

It would be great if the solution doesn't take too long as the matrix I am using is of shape (100, 100) and I would like to obtain 10,000-100,000 copies.

My intuition would be to somehow obtain the lower or upper half of the matrix as a flattened array, do the permutation, and replace the values in both the upper and lower halves. This, however, would take me a while to figure out, and I would like to know if there is a more straightforward approach. Thanks.

Upvotes: 1

Views: 225

Answers (1)

swag2198

Reputation: 2696

You can try this:

import numpy as np

x = np.array([[1.00, 0.46, 0.89, 0.76, 0.65],
              [0.46, 1.00, 0.83, 0.88, 0.29],
              [0.89, 0.83, 1.00, 0.57, 0.84],
              [0.76, 0.88, 0.57, 1.00, 0.39],
              [0.65, 0.29, 0.84, 0.39, 1.00]])
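# Row (i) and column (j) index of every cell, flattened in row-major order
j, i = np.meshgrid(np.arange(x.shape[0]), np.arange(x.shape[0]))
i, j = i.flatten(), j.flatten()
# Keep only the cells strictly above the diagonal
up_i, up_j = i[i < j], j[i < j]

elems = x[up_i, up_j]     # fancy indexing returns a copy of the upper triangle
np.random.shuffle(elems)  # shuffles the copy in place
x[up_i, up_j] = elems     # write back to the upper triangle
x[up_j, up_i] = elems     # mirror into the lower triangle (x is modified in place)
x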
array([[1.  , 0.57, 0.88, 0.46, 0.76],
       [0.57, 1.  , 0.39, 0.29, 0.83],
       [0.88, 0.39, 1.  , 0.65, 0.89],
       [0.46, 0.29, 0.65, 1.  , 0.84],
       [0.76, 0.83, 0.89, 0.84, 1.  ]])

If all your matrices have the same shape, you only need to call meshgrid and compute the upper-triangle indices once.
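As a rough sketch of how the indices could be computed once and reused for many copies: the function name `permuted_copies` and its `seed` parameter are made up for illustration, and `np.triu_indices` is swapped in for the meshgrid step. Each copy gets its own permutation by argsorting a row of random keys, which is one possible way to do it, not necessarily the fastest.

```python
import numpy as np

def permuted_copies(x, n_copies, seed=None):
    """Return n_copies symmetric copies of x with off-diagonal entries shuffled.

    Hypothetical helper for illustration; x itself is left unchanged.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    # Upper-triangle indices (diagonal excluded), computed only once.
    up_i, up_j = np.triu_indices(n, k=1)
    elems = x[up_i, up_j]
    # One independent ordering per copy: argsort of random keys along each row.
    order = rng.random((n_copies, elems.size)).argsort(axis=1)
    shuffled = elems[order]  # shape (n_copies, n * (n - 1) // 2)
    # Start from stacked copies of x so the diagonal is preserved.
    out = np.repeat(x[None, :, :], n_copies, axis=0)
    out[:, up_i, up_j] = shuffled  # fill the upper triangles
    out[:, up_j, up_i] = shuffled  # mirror into the lower triangles
    return out
```

For a (100, 100) matrix, each float64 copy takes about 80 kB, so 100,000 copies held at once would need roughly 8 GB; generating them in smaller batches may be more practical.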

This uses NumPy fancy indexing to fetch the elements above the diagonal and to write the shuffled values back into both halves.
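To illustrate the indexing step on its own, here is a small check, assuming N=5 as in the question, that the meshgrid-plus-mask approach selects the same cells, in the same order, as NumPy's built-in `np.triu_indices` with `k=1`. The variable names `tri_i`, `tri_j`, and `same` are only for this example.

```python
import numpy as np

n = 5  # matrix size from the question's example

# Indices built as in the answer: meshgrid, flatten, then mask i < j.
j, i = np.meshgrid(np.arange(n), np.arange(n))
i, j = i.flatten(), j.flatten()
mask = i < j
up_i, up_j = i[mask], j[mask]

# Built-in equivalent: indices strictly above the main diagonal.
tri_i, tri_j = np.triu_indices(n, k=1)

# Both approaches list the cells in row-major order.
same = np.array_equal(up_i, tri_i) and np.array_equal(up_j, tri_j)
print(same, up_i.size)
```

Because indexing with integer arrays returns a copy rather than a view, shuffling the extracted elements does not change the matrix until they are assigned back.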

Upvotes: 1
