Reputation: 51
I'm trying to load the MNIST dataset into arrays. When I use (X_train, y_train), (X_test, y_test) = mnist.load_data(), I get an array y_test with shape (10000,), but I want it to have shape (10000,1). What is the difference between an array of shape (10000,1) and one of shape (10000,)? How can I convert y_test from (10000,) to (10000,1)?
Upvotes: 4
Views: 6122
Reputation: 119
To convert (10,1) to (10,), you can simply collapse the columns. For example, take an array x with x.shape = (10,1). Indexing with x[:,0] selects the single column, so x[:,0].shape = (10,).
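As a rough sketch of the collapse described above (the array x here is just a made-up stand-in, and ravel and squeeze are alternatives not mentioned in the answer), something like this should work:

```python
import numpy as np

# Made-up column array with shape (10, 1)
x = np.arange(10).reshape(10, 1)

by_index = x[:, 0]              # select the only column
by_ravel = x.ravel()            # flatten everything to 1-D
by_squeeze = x.squeeze(axis=1)  # drop the length-1 axis explicitly

print(by_index.shape, by_ravel.shape, by_squeeze.shape)  # (10,) (10,) (10,)
```

All three give the same values here; squeeze(axis=1) has the advantage of raising an error if that axis is not actually of length 1.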
To convert (10,) to (10,1), you can add a dimension by using np.newaxis (after import numpy as np, assuming we are using numpy arrays here). Take an array y, for example, with y.shape = (10,). Using y[:, np.newaxis], you get a new array with shape (10,1).
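A small sketch of the np.newaxis approach, using a made-up array y. Where the new axis goes depends on where np.newaxis sits in the index:

```python
import numpy as np

y = np.arange(10)          # shape (10,)

col = y[:, np.newaxis]     # new axis at the end   -> (10, 1)
row = y[np.newaxis, :]     # new axis at the front -> (1, 10)
also_col = y[:, None]      # np.newaxis is an alias for None

print(col.shape, row.shape, also_col.shape)  # (10, 1) (1, 10) (10, 1)
```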
Upvotes: 0
Reputation: 142
Your first array, with shape (10000,), is a 1-dimensional np.ndarray.
The shape attribute of a NumPy array is a tuple, and a tuple of length 1 needs a trailing comma, so the shape is written (10000,) and not (10000) (which would be an int). So currently your data looks like this:
import numpy as np
a = np.arange(5) # >>> array([0, 1, 2, 3, 4])
print(a.shape) # >>> (5,)
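To make the difference concrete, here is a minimal sketch (the arrays flat and column are made up for illustration). The trailing comma is what makes the shape a tuple, and the extra axis changes what indexing returns:

```python
import numpy as np

print(type((10000,)))  # <class 'tuple'>
print(type((10000)))   # <class 'int'>

flat = np.zeros(5)         # shape (5,), one axis
column = np.zeros((5, 1))  # shape (5, 1), two axes

print(flat.ndim, column.ndim)  # 1 2
print(flat[2].shape)           # ()   -> a single number
print(column[2].shape)         # (1,) -> still an array, with one element
```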
What you want is a 2-dimensional array with shape (10000, 1).
Adding a dimension of length 1 doesn't require any additional data; it is basically an "empty" dimension. To add a dimension to an existing array you can use either np.expand_dims() or np.reshape().
Using np.expand_dims:
import numpy as np
b = np.arange(5) # >>> array([0, 1, 2, 3, 4])
b = np.expand_dims(b, axis=1) # >>> array([[0],[1],[2],[3],[4]])
This function exists specifically to add length-1 dimensions to arrays. The axis keyword specifies which position the newly added dimension will occupy in the resulting shape.
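As a quick illustration of how the axis keyword affects the result (b is the same kind of toy array as above), this sketch prints the shape for a few positions:

```python
import numpy as np

b = np.arange(5)  # shape (5,)

front = np.expand_dims(b, axis=0)   # new axis first
back = np.expand_dims(b, axis=1)    # new axis second
last = np.expand_dims(b, axis=-1)   # negative values count from the end

print(front.shape, back.shape, last.shape)  # (1, 5) (5, 1) (5, 1)
```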
Using np.reshape:
import numpy as np
a = np.arange(5)
# pass the new shape positionally; the keyword name differs between NumPy versions
a_reshaped = np.reshape(a, (-1, 1)) # >>> array([[0],[1],[2],[3],[4]])
The (-1, 1) argument specifies what the new shape should look like after the reshape operation. NumPy replaces the -1 internally with whatever length fits the data, which is 5 here.
Reshape is a more powerful function than expand_dims and can be used in many different ways. You can read more about its other uses in the NumPy docs for numpy.reshape().
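Applied to the question, a sketch could look like the following. The y_test here is a hypothetical stand-in of zeros with the same shape as the MNIST test labels, so the snippet runs without downloading the dataset:

```python
import numpy as np

# Hypothetical stand-in for the labels returned by mnist.load_data()
y_test = np.zeros(10000, dtype=np.uint8)

y_col = y_test.reshape(-1, 1)  # -1 is inferred as 10000
print(y_col.shape)             # (10000, 1)

# For a contiguous array like this one, reshape returns a view, not a copy
print(np.shares_memory(y_test, y_col))  # True

y_back = y_col.reshape(-1)     # back to the original 1-D shape
print(y_back.shape)            # (10000,)
```

Because the result is a view in this case, writing into y_col would also change y_test; call .copy() if you need an independent array.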
Upvotes: 5