Integerard

Reputation: 13

Is it possible to perform the same shuffle on multiple numpy arrays in place?

I need to apply the same shuffle to multiple arrays at once. The arrays may be large (multiple GBs), so the shuffle has to happen in place. A simple example:

Before shuffle:

Arr1: [[1 2 3]
       [4 5 6]]
Arr2: [0 1]

After shuffle:

Arr1: [[4 5 6]
       [1 2 3]]
Arr2: [1 0]

Upvotes: 0

Views: 428

Answers (1)

Integerard

Reputation: 13

Here is the best solution I've found:

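import random

import numpy as np
from numpy.random import RandomState

def shuffleDataAndLabelsInPlace(arr1, arr2):
    # Both arrays must have the same length along the first axis.
    # RandomState only accepts seeds in the range [0, 2**32 - 1].
    seed = random.randint(0, 2**32 - 1)
    # Re-seeding with the same value applies the same permutation to both arrays.
    prng = RandomState(seed)
    prng.shuffle(arr1)
    prng = RandomState(seed)
    prng.shuffle(arr2)

# Example:
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([0, 1])

print("Before shuffle")
print(arr1)
print(arr2)
print("After")
shuffleDataAndLabelsInPlace(arr1, arr2)
print(arr1)
print(arr2)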

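(The shuffle is random, so the output matches the question only when the two rows are swapped; in every run the rows of arr1 and the entries of arr2 stay paired.)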
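If relying on two generators with the same seed feels fragile, one alternative is to run a single Fisher-Yates pass and apply each swap to every array. The following is only a minimal sketch: the function name `shuffle_together_in_place` and its `rng` parameter are made up for illustration, and the Python-level loop will likely be slow for arrays with very many rows.

```python
import numpy as np

def shuffle_together_in_place(arrays, rng=None):
    """Shuffle all arrays in place along axis 0 using the same swaps.

    Hypothetical helper, not part of NumPy.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same length along axis 0")
    # Fisher-Yates: walk from the last index down and swap with a random
    # index in [0, i]. Each swap is applied to every array, so rows stay paired.
    for i in range(n - 1, 0, -1):
        j = int(rng.integers(0, i + 1))  # upper bound is exclusive
        if i == j:
            continue
        for a in arrays:
            # The right-hand side is a small temporary copy of two rows only.
            a[[i, j]] = a[[j, i]]

# Example usage with the arrays from the question:
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([0, 1])
shuffle_together_in_place([arr1, arr2])
print(arr1)
print(arr2)
```

The extra memory is limited to the two rows copied during each swap, so the full arrays are never duplicated.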

Upvotes: 1
