Reputation: 2167
I am trying to index the testdata with just the labels that equal 2 and 3. However, when I run this code, it turns my array from 2D (100 x 100) into 3D (100 x 1 x 100).
Can anyone explain why it is doing this? The last line in the code is the culprit, but I am not sure why it is happening.
labels = testdata[:,0]
num2 = numpy.nonzero(labels == 2)
num2 = numpy.transpose(num2)
num3 = numpy.nonzero(labels == 3)
num3 = numpy.transpose(num3)
num = numpy.vstack([num2,num3])
testdata = testdata[num,:]
Upvotes: 2
Views: 64
Reputation: 231395
When there are puzzles, print intermediate values. Better yet, run a test case in an interactive shell so you can check each value and understand what is going on. Keep track of the shapes.
It looks like labels is a 1d array of numbers like:
In [212]: labels=np.array([0,1,2,2,3,2,0,3,2])
Indexes where labels is 2 or 3:
In [213]: num2=np.nonzero(labels==2)
In [214]: num2
Out[214]: (array([2, 3, 5, 8], dtype=int32),)
In [215]: num3=np.nonzero(labels==3)
Here's a key step: what is the purpose of the transpose? Note that num2 is a tuple containing one 1d array.
In [216]: num2=np.transpose(num2)
In [217]: num3=np.transpose(num3)
In [218]: num2
Out[218]:
array([[2],
       [3],
       [5],
       [8]], dtype=int32)
After the transpose, num2 is a column array with shape (4,1).
Joining them vertically produces a (6,1) array:
In [220]: num=np.vstack([num2,num3])
In [221]: num
Out[221]:
array([[2],
       [3],
       [5],
       [8],
       [4],
       [7]], dtype=int32)
In [222]: num.shape
Out[222]: (6, 1)
In [223]: labels[num]
Out[223]:
array([[2],
       [2],
       [2],
       [2],
       [3],
       [3]])
In [224]: labels[num].shape
Out[224]: (6, 1)
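Indexing the 1d array with that array produces another array with the same shape as the index. Indexing x[num,:] does the same thing, but with the added last dimension: the result has shape num.shape + (number of columns,). That is why an (N,1) index into your (100,100) array gives an (N,1,100) result.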
If I index a (3,4) array with a (2,5) array in the 1st dimension, the result is a (2,5,4) array:
In [227]: np.ones((3,4))[np.ones((2,5),int),:].shape
Out[227]: (2, 5, 4)
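To keep the result 2d, the index into the first dimension needs to be 1d (or a boolean mask). Here is a minimal sketch of two ways to do that. The testdata array below is a made-up stand-in with the labels in column 0, and using np.isin is my suggestion rather than anything from the original code:

```python
import numpy as np

# Hypothetical stand-in for testdata: 9 rows, 4 columns, labels in column 0.
labels = np.array([0, 1, 2, 2, 3, 2, 0, 3, 2])
testdata = np.column_stack([labels, np.arange(27).reshape(9, 3)])

# The original approach: a (6, 1) index array adds an extra axis.
num2 = np.transpose(np.nonzero(testdata[:, 0] == 2))
num3 = np.transpose(np.nonzero(testdata[:, 0] == 3))
num = np.vstack([num2, num3])
print(testdata[num, :].shape)          # (6, 1, 4)

# Option 1: flatten the index to 1d before indexing.
print(testdata[num.ravel(), :].shape)  # (6, 4)

# Option 2: skip nonzero/transpose and use a boolean mask on the label column.
mask = np.isin(testdata[:, 0], [2, 3])
selected = testdata[mask]
print(selected.shape)                  # (6, 4)
```

Note that option 1 keeps the rows grouped (all the 2s, then all the 3s), while the boolean mask in option 2 keeps them in their original order.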
Upvotes: 1