Reputation: 162
I am using train_test_split. My training set, X[], is an array of file paths. I have another array, y[], that is composed of one-hot encoded labels. The two are related by the array row index. So if I print X, it looks like this:
Index path
4, data\djip2\DJIP2.5844MHz.10MSPS.fc32.2016-07-01_000000000001.npy
20, data\taigentank\USRP-2_420GHz-1MSps-1MHztaigentank1_000000000000.npy
2, data\866_300_1\USRP-866_300MHz-1MSps-1MHz_lte_1_000000000002.npy
And y[] looks like this:
Index label
4, 00000001
20, 00000010
2, 01000000
These arrays are then passed to a batch generator after being randomized. In the batch generator, I need to make sure each X[] value can be mapped back to its y[] label.
So, I want to be able to get the X array indexes which are now in a random order like:
2, path
4, path
20, path
And pass them to another function in this order. I need the indices because I need to pass the path as well as its associated label. Is there a simple way to do this with numpy?
Upvotes: 1
Views: 137
Reputation: 656
One solution might be:
n = list(range(numberOfInstances))
which creates a list of integers like [0,1,2,3,4...numberOfInstances-1]. Then shuffle the list
random.shuffle(n)
Save this list as a numpy array
n_np = np.array(n)
and finally reorder your data and groundtruth accordingly like
y = y[n_np]
x = x[n_np]
which reorders both arrays in the same way, so row i of x still corresponds to row i of y.
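The same idea can be written with numpy alone, which also keeps the shuffled indices around so each path can be passed on together with its label. This is only a minimal sketch: the file names and the 4-class labels below are made-up placeholders, not the data from the question, and the seed is there purely to make the run repeatable.

```python
import numpy as np

# Hypothetical stand-ins for the arrays in the question.
X = np.array(["a.npy", "b.npy", "c.npy", "d.npy"])  # file paths
y = np.array([[0, 0, 0, 1],                         # one-hot labels
              [0, 0, 1, 0],
              [0, 1, 0, 0],
              [1, 0, 0, 0]])

# Seeded generator, used here only for reproducibility.
rng = np.random.default_rng(seed=0)

# A random ordering of the row indices 0 .. len(X)-1.
perm = rng.permutation(len(X))

# Indexing both arrays with the same permutation keeps rows aligned.
X_shuffled = X[perm]
y_shuffled = y[perm]

# Each original index travels with its path and its label.
for idx, path, label in zip(perm, X_shuffled, y_shuffled):
    print(idx, path, label)
```

Because perm holds the original row numbers, it can be handed to the batch generator instead of (or alongside) the shuffled arrays, and the generator can look up X[idx] and y[idx] itself.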
I hope this helps :)! I am a little confused, though, about why you have an unordered sequence of IDs that somehow gets shuffled again.
Upvotes: 1