Reputation: 18821
Let's say I have an array r of dimension (n, m). I would like to shuffle the columns of that array.
If I use numpy.random.shuffle(r), it shuffles the rows. How can I shuffle only the columns, so that, for example, the first column becomes the second and the third becomes the first, at random?
Example:
input:
array([[ 1, 20, 100],
       [ 2, 31, 401],
       [ 8, 11, 108]])
output:
array([[ 20, 1, 100],
       [ 31, 2, 401],
       [ 11, 8, 108]])
Upvotes: 26
Views: 31694
Reputation: 3349
numpy.random.Generator.shuffle has an axis parameter. This will shuffle the columns in place:
rng = np.random.default_rng()
rng.shuffle(arr, axis=1)
https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.shuffle.html
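A minimal sketch of the call above, applied to the array from the question. It assumes a NumPy version whose Generator.shuffle accepts axis; the seed value and the names original, original_columns and shuffled_columns are illustrative only.

```python
import numpy as np

# Seeded generator so the run is reproducible; the seed value is arbitrary.
rng = np.random.default_rng(seed=0)

arr = np.array([[1, 20, 100],
                [2, 31, 401],
                [8, 11, 108]])
original = arr.copy()

# Shuffle the columns in place; the contents of each column stay together.
rng.shuffle(arr, axis=1)

# Every column of the result is one of the original columns.
original_columns = {tuple(col) for col in original.T}
shuffled_columns = {tuple(col) for col in arr.T}
print(arr)
```

Since the shuffle happens in place, keep a copy (as original does here) if the unshuffled array is still needed.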
Upvotes: 0
Reputation: 18821
One approach is to shuffle the transposed array:
np.random.shuffle(np.transpose(r))
Another approach (see YXD's answer https://stackoverflow.com/a/20546567/1787973) is to generate a random permutation of the column indices and use it to select the columns in that order:
r = r[:, np.random.permutation(r.shape[1])]
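Performance-wise, the second approach is faster. Note that the first approach modifies r in place (the transpose is a view of r), while the second builds a new array and assigns it back to r.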
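To make the difference between the two approaches concrete, here is a small sketch using the array from the question. The names before, perm, copied and shares_memory are introduced here for illustration.

```python
import numpy as np

r = np.array([[1, 20, 100],
              [2, 31, 401],
              [8, 11, 108]])
before = r.copy()

# Approach 1: r.T is a view, so shuffling its rows rearranges r's columns in place.
np.random.shuffle(r.T)
in_place_columns = sorted(tuple(col) for col in r.T)

# Approach 2: indexing with a permutation of the column indices builds a new array.
perm = np.random.permutation(before.shape[1])
copied = before[:, perm]
shares_memory = np.shares_memory(copied, before)  # False: the result is a copy
```

Keeping perm around is handy when the same column order has to be applied to a second array.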
Upvotes: 29
Reputation: 1193
There is another way, which does not use transposition and is apparently faster:
np.take(r, np.random.permutation(r.shape[1]), axis=1, out=r)
CPU times: user 1.14 ms, sys: 1.03 ms, total: 2.17 ms. Wall time: 3.89 ms
The approach in other answers: np.random.shuffle(r.T)
CPU times: user 2.24 ms, sys: 0 ns, total: 2.24 ms Wall time: 5.08 ms
I used r = np.arange(64*1000).reshape(64, 1000) as an input.
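For anyone who wants to repeat the comparison, a rough timing sketch with timeit follows. The function names are hypothetical. Two things differ from the calls above and are assumptions of this sketch: the transpose version copies its input so every run starts from the same data (which adds some overhead), and np.take is called without out=r, so it allocates a new array instead of writing into the array it reads from. Numbers will vary by machine and NumPy version.

```python
import timeit
import numpy as np

r = np.arange(64 * 1000).reshape(64, 1000)

def shuffle_via_transpose(a):
    a = a.copy()              # copy so each run starts from the same input
    np.random.shuffle(a.T)    # in-place shuffle through the transposed view
    return a

def shuffle_via_take(a):
    # np.take without out=: returns a new array with the columns reordered
    return np.take(a, np.random.permutation(a.shape[1]), axis=1)

for fn in (shuffle_via_transpose, shuffle_via_take):
    seconds = timeit.timeit(lambda: fn(r), number=20) / 20
    print(f"{fn.__name__}: {seconds * 1e3:.3f} ms per call")
```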
Upvotes: 2
Reputation: 21
>>> print(s0)
[[0. 1. 0. 1.]
 [0. 1. 0. 0.]
 [0. 1. 0. 1.]
 [0. 0. 0. 1.]]
>>> print(np.random.permutation(s0.T).T)
[[1. 0. 1. 0.]
 [0. 0. 1. 0.]
 [1. 0. 1. 0.]
 [1. 0. 0. 0.]]
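np.random.permutation() permutes along the first axis (the rows) and returns a shuffled copy, so transposing before and after permutes the columns and leaves s0 unchanged.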
Upvotes: 2
Reputation: 217
So, one step further from your answer:
Edit: I could easily be mistaken about how this works, so I'm including my understanding of the state of the matrix at each step.
r == 1 2 3
     4 5 6
     6 7 8
r = np.transpose(r)
r == 1 4 6
     2 5 7
     3 6 8 # Columns are now rows
np.random.shuffle(r)
r == 2 5 7
     3 6 8
     1 4 6 # Columns-as-rows are shuffled
r = np.transpose(r)
r == 2 3 1
     5 6 4
     7 8 6 # Columns are columns again, shuffled.
which would then be back in the proper shape, with the columns rearranged.
Transposing a matrix twice gives back the original matrix, i.e. (A^T)^T == A. So you need a second transpose after the shuffle (a transpose is not a shuffle) to get the array back into its original orientation.
Edit: The OP's answer skips storing the transposed arrays and instead lets the shuffle operate directly on the transposed view of r, which rearranges r in place.
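As a quick check that the step-by-step version and the OP's one-liner do the same thing, the sketch below runs both with identically seeded RandomState instances. The seed and the names step, explicit, compact and same_result are illustrative.

```python
import numpy as np

r = np.array([[1, 2, 3],
              [4, 5, 6],
              [6, 7, 8]])

# Explicit version: transpose, shuffle the rows, transpose back.
step = np.transpose(r.copy())
np.random.RandomState(0).shuffle(step)
explicit = np.transpose(step)

# Compact version: shuffle the transposed view directly.
compact = r.copy()
np.random.RandomState(0).shuffle(compact.T)

# Same seed, same permutation, so both arrays end up identical.
same_result = np.array_equal(explicit, compact)
print(same_result)
```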
Upvotes: 5
Reputation: 18693
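In general, if you want to shuffle a NumPy array along axis i:
import numpy as np

def shuffle(x, axis=0):
    # Build an axis order that swaps axis 0 with the target axis.
    n_axis = len(x.shape)
    t = np.arange(n_axis)
    t[0] = axis
    t[axis] = 0
    # Work on a copy so the caller's array is left untouched.
    xt = np.transpose(x.copy(), t)
    # np.random.shuffle permutes along the first axis, in place.
    np.random.shuffle(xt)
    # The swap is its own inverse, so applying it again restores the axis order.
    shuffled_x = np.transpose(xt, t)
    return shuffled_x

shuffle(array, axis=i)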
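The same idea can be written a little more compactly with np.swapaxes, which also returns a view. This is a sketch, not the answer's code: the name shuffle_along_axis and the 3-D test array are made up for illustration.

```python
import numpy as np

def shuffle_along_axis(x, axis=0):
    """Return a copy of x with its slices along `axis` in random order."""
    out = x.copy()
    # swapaxes returns a view, so shuffling it reorders `out` along `axis`.
    np.random.shuffle(np.swapaxes(out, 0, axis))
    return out

x = np.arange(24).reshape(2, 3, 4)
y = shuffle_along_axis(x, axis=2)

# Each slice y[:, :, k] is one of the original slices x[:, :, j].
x_slices = {tuple(x[:, :, k].ravel()) for k in range(x.shape[2])}
y_slices = {tuple(y[:, :, k].ravel()) for k in range(y.shape[2])}
```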
Upvotes: 2
Reputation: 32511
For a general axis you could follow the pattern:
>>> import numpy as np
>>>
>>> a = np.array([[ 1, 20, 100, 4],
... [ 2, 31, 401, 5],
... [ 8, 11, 108, 6]])
>>>
>>> print(a[:, np.random.permutation(a.shape[1])])
[[  4   1  20 100]
 [  5   2  31 401]
 [  6   8  11 108]]
>>>
>>> print(a[np.random.permutation(a.shape[0]), :])
[[  1  20 100   4]
 [  2  31 401   5]
 [  8  11 108   6]]
>>>
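If the axis is only known at run time, the indexing pattern above can be wrapped with np.take, which accepts the axis as an argument. A minimal sketch follows; permute_axis is a hypothetical helper name, not something NumPy provides.

```python
import numpy as np

def permute_axis(a, axis):
    """Return a copy of a with its entries along `axis` in random order."""
    order = np.random.permutation(a.shape[axis])
    # For a 2-D array, np.take(a, order, axis=1) matches a[:, order].
    return np.take(a, order, axis=axis)

a = np.array([[1, 20, 100, 4],
              [2, 31, 401, 5],
              [8, 11, 108, 6]])

cols_shuffled = permute_axis(a, axis=1)  # columns reordered, rows intact
rows_shuffled = permute_axis(a, axis=0)  # rows reordered, columns intact
```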
Upvotes: 6