Maxime Chéramy

Reputation: 18821

Shuffle columns of an array with Numpy

Let's say I have an array r of dimension (n, m). I would like to shuffle the columns of that array.

If I use numpy.random.shuffle(r), it shuffles the rows. How can I shuffle only the columns, so that, for example, the first column becomes the second one and the third becomes the first, at random?

Example:

input:

array([[  1,  20, 100],
       [  2,  31, 401],
       [  8,  11, 108]])

output:

array([[ 20,   1, 100],
       [ 31,   2, 401],
       [ 11,   8, 108]])

Upvotes: 26

Views: 31694

Answers (7)

isarandi

Reputation: 3349

numpy.random.Generator.shuffle has an axis parameter. This will shuffle in place:

rng = np.random.default_rng()
rng.shuffle(arr, axis=1)

https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.shuffle.html
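A minimal, self-contained sketch of the call above, applied to the array from the question. The seed and the variable names original_columns and shuffled_columns are only there for illustration. It assumes a NumPy version whose Generator.shuffle accepts axis (I believe 1.18 or later).

```python
import numpy as np

# Seed chosen only so the sketch is repeatable; omit it for real use.
rng = np.random.default_rng(seed=0)

arr = np.array([[1, 20, 100],
                [2, 31, 401],
                [8, 11, 108]])

# Record the columns as tuples so we can compare them after shuffling.
original_columns = {tuple(col) for col in arr.T}

# Shuffles whole columns in place and returns None.
rng.shuffle(arr, axis=1)

shuffled_columns = {tuple(col) for col in arr.T}
print(arr)
```

Each column keeps its contents; only the column order changes.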

Upvotes: 0

Maxime Chéramy

Reputation: 18821

One approach is to shuffle the transposed array:

 np.random.shuffle(np.transpose(r))

Another approach (see YXD's answer https://stackoverflow.com/a/20546567/1787973) is to generate a random permutation of the column indices and use it to select the columns in that order:

 r = r[:, np.random.permutation(r.shape[1])]

Performance-wise, the second approach is faster.
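Besides speed, the two approaches differ in whether r is modified. Here is a small sketch of that difference; the names original, source, shuffled and transpose_is_view are made up for this example.

```python
import numpy as np

np.random.seed(0)  # fixed seed only so the sketch is repeatable

r = np.array([[1, 20, 100],
              [2, 31, 401],
              [8, 11, 108]])
original = r.copy()

# Approach 1: r.T is a view of r, so shuffling the rows of the view
# reorders the columns of r itself, in place.
transpose_is_view = np.shares_memory(r, r.T)
np.random.shuffle(r.T)

# Approach 2: indexing with a permutation builds a new array;
# the source array is left untouched unless you reassign it.
source = original.copy()
shuffled = source[:, np.random.permutation(source.shape[1])]
```

So the first form suits in-place shuffling, while the second only replaces r if you assign the result back to it.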

Upvotes: 29

Ataxias

Reputation: 1193

There is another way, which does not use transposition and is apparently faster:

np.take(r, np.random.permutation(r.shape[1]), axis=1, out=r)

CPU times: user 1.14 ms, sys: 1.03 ms, total: 2.17 ms; wall time: 3.89 ms

The approach in other answers, np.random.shuffle(r.T):

CPU times: user 2.24 ms, sys: 0 ns, total: 2.24 ms; wall time: 5.08 ms

I used r = np.arange(64*1000).reshape(64, 1000) as an input.
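Single runs like these are noisy, so here is a rough sketch for repeating the comparison with timeit on the same input. The function names and the repeat count of 100 are arbitrary choices for this example, and results will vary by machine and NumPy version.

```python
import timeit
import numpy as np

r = np.arange(64 * 1000).reshape(64, 1000)  # same input as above

def shuffle_transposed(a):
    # In place, through the transposed view.
    np.random.shuffle(a.T)

def take_with_permutation(a):
    # In place, writing the permuted columns back into `a`.
    np.take(a, np.random.permutation(a.shape[1]), axis=1, out=a)

def index_with_permutation(a):
    # Returns a shuffled copy; `a` is not modified.
    return a[:, np.random.permutation(a.shape[1])]

repeats = 100  # arbitrary repeat count for this sketch
timings = {}
for func in (shuffle_transposed, take_with_permutation, index_with_permutation):
    timings[func.__name__] = timeit.timeit(lambda: func(r), number=repeats)

for name, seconds in timings.items():
    print(f"{name}: {seconds / repeats * 1e3:.3f} ms per call")
```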

Upvotes: 2

Sandip Saha

Reputation: 21

>>> print(s0)
[[0. 1. 0. 1.]
 [0. 1. 0. 0.]
 [0. 1. 0. 1.]
 [0. 0. 0. 1.]]
>>> print(np.random.permutation(s0.T).T)
[[1. 0. 1. 0.]
 [0. 0. 1. 0.]
 [1. 0. 1. 0.]
 [1. 0. 0. 0.]]

np.random.permutation() permutes the rows (the first axis) and returns a new array, so transposing before and after the call permutes the columns.

Upvotes: 2

Matthew

Reputation: 217

So, one step further from your answer:

Edit: I could easily be mistaken about how this works, so I'm including my understanding of the state of the matrix at each step.

r == 1 2 3
     4 5 6
     6 7 8

r = np.transpose(r)  

r == 1 4 6
     2 5 7
     3 6 8           # Columns are now rows

np.random.shuffle(r)

r == 2 5 7
     3 6 8 
     1 4 6           # Columns-as-rows are shuffled

r = np.transpose(r)  

r == 2 3 1
     5 6 4
     7 8 6           # Columns are columns again, shuffled.

which would then be back in the proper shape, with the columns rearranged.

The transpose of the transpose of a matrix is that matrix: [A^T]^T == A. A transpose is not a shuffle, so you need a second transpose after the shuffle to get the array back into its original shape.

Edit: The OP's answer skips storing the transposed arrays. It shuffles the transposed view of r directly, which reorders the columns of r in place.

Upvotes: 5

patapouf_ai

Reputation: 18693

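In general, if you want to shuffle a numpy array along axis i:

import numpy as np

def shuffle(x, axis=0):
    # Build an axis order that swaps `axis` with axis 0.
    t = np.arange(x.ndim)
    t[0] = axis
    t[axis] = 0
    # Work on a copy so the caller's array is not modified.
    xt = np.transpose(x.copy(), t)
    # np.random.shuffle permutes along the first axis, in place.
    np.random.shuffle(xt)
    # The swap is its own inverse, so the same order restores the layout.
    return np.transpose(xt, t)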

shuffle(array, axis=i)
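As an alternative to the hand-written function, newer NumPy versions offer Generator.permutation with an axis argument, which also returns a shuffled copy. This is a different technique from the transpose trick above; a minimal sketch, assuming a NumPy version that supports axis here (I believe 1.18 or later):

```python
import numpy as np

# Seed chosen only so the sketch is repeatable.
rng = np.random.default_rng(seed=0)

x = np.arange(24).reshape(2, 3, 4)

# Reorders the three slices along axis 1 as whole units and
# returns a new array; x itself is left unchanged.
shuffled = rng.permutation(x, axis=1)

print(shuffled.shape)
```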

Upvotes: 2

YXD

Reputation: 32511

For a general axis you could follow the pattern:

>>> import numpy as np
>>> 
>>> a = np.array([[  1,  20, 100, 4],
...               [  2,  31, 401, 5],
...               [  8,  11, 108, 6]])
>>> 
>>> print(a[:, np.random.permutation(a.shape[1])])
[[  4   1  20 100]
 [  5   2  31 401]
 [  6   8  11 108]]
>>> 
>>> print(a[np.random.permutation(a.shape[0]), :])
[[  1  20 100   4]
 [  2  31 401   5]
 [  8  11 108   6]]
>>> 
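The same pattern can be wrapped in a helper that takes the axis as a parameter, so you don't have to write the index expression by hand. The function name permute_along_axis and its rng argument are hypothetical, chosen for this sketch only.

```python
import numpy as np

def permute_along_axis(a, axis, rng=None):
    """Return a copy of `a` with its slices along `axis` in random order."""
    rng = np.random.default_rng() if rng is None else rng
    # One random ordering of the indices along the chosen axis.
    order = rng.permutation(a.shape[axis])
    # np.take selects whole slices in that order and returns a new array.
    return np.take(a, order, axis=axis)

a = np.array([[1, 20, 100, 4],
              [2, 31, 401, 5],
              [8, 11, 108, 6]])

cols_shuffled = permute_along_axis(a, axis=1)
rows_shuffled = permute_along_axis(a, axis=0)
print(cols_shuffled)
print(rows_shuffled)
```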

Upvotes: 6
