Reputation: 51

python difference between array(10,1) array(10,)

I'm trying to load the MNIST dataset into arrays. When I use (X_train, y_train), (X_test, y_test) = mnist.load_data(), I get an array y_test with shape (10000,), but I want it to have shape (10000,1). What is the difference between an array of shape (10000,1) and one of shape (10000,)? How can I convert between the two?

Upvotes: 4

Answers (2)

CyTex

Reputation: 119

An array with a shape of (10,1) is a 2D array with 10 rows and a single column.
An array with a shape of (10,) is a 1D array with 10 elements.
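As a rough illustration of how the two shapes behave differently, here is a small sketch. The names flat and column are made up for this example; the calls are plain NumPy.

```python
import numpy as np

flat = np.arange(10)           # shape (10,): a single axis
column = flat.reshape(10, 1)   # shape (10, 1): ten rows, one column

print(flat.ndim, flat.shape)       # 1 (10,)
print(column.ndim, column.shape)   # 2 (10, 1)

# A 1D array needs one index to reach a scalar, a 2D array needs two.
print(flat[3])        # 3
print(column[3, 0])   # 3

# Broadcasting treats the shapes differently: (10,) + (10, 1) gives (10, 10).
print((flat + column).shape)   # (10, 10)
```

The last line is the usual place where mixing the two shapes causes surprises, since the sum is a 10x10 grid rather than an elementwise result.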

To convert (10,1) to (10,), you can simply drop the second axis. For example, take an array x with x.shape = (10,1). Indexing with x[:,0] selects the only column, so x[:,0].shape = (10,).
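A minimal sketch of that conversion, assuming x is a NumPy array. The ravel, squeeze and reshape variants are not part of the answer above; they are other standard NumPy calls that should give the same shape here.

```python
import numpy as np

x = np.arange(10).reshape(10, 1)   # example array with shape (10, 1)

by_index = x[:, 0]              # select the only column
by_ravel = x.ravel()            # flatten to 1D
by_squeeze = x.squeeze(axis=1)  # drop the length-1 axis
by_reshape = x.reshape(-1)      # let NumPy infer the single length

for result in (by_index, by_ravel, by_squeeze, by_reshape):
    print(result.shape)   # (10,) each time
```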

To convert (10,) to (10,1), you can add a dimension with np.newaxis (after import numpy as np, assuming we are using NumPy arrays here). Take an array y with y.shape = (10,), for example. Using y[:, np.newaxis], you get a new array with the shape of (10,1).
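A short sketch of the np.newaxis approach, again assuming a NumPy array. The names as_column and as_row are only for illustration.

```python
import numpy as np

y = np.arange(10)             # shape (10,)

as_column = y[:, np.newaxis]  # new axis after the existing one
as_row = y[np.newaxis, :]     # new axis before the existing one

print(as_column.shape)   # (10, 1)
print(as_row.shape)      # (1, 10)

# np.newaxis is an alias for None, so y[:, None] does the same thing.
print(y[:, None].shape)  # (10, 1)

# No data is copied: the result shares memory with y.
print(np.shares_memory(y, as_column))  # True
```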

Upvotes: 0

Twald

Reputation: 142

Your first array, with shape (10000,), is a 1-dimensional np.ndarray. The shape attribute of a NumPy array is a tuple, and a tuple of length 1 needs a trailing comma, so the shape is written (10000,) and not (10000) (which would be an int). So currently your data looks like this:

import numpy as np
a = np.arange(5) #  >>> array([0, 1, 2, 3, 4])
print(a.shape) #    >>> (5,)
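The trailing-comma point can be checked directly in plain Python. This is only a quick sketch to make the tuple-versus-int distinction concrete.

```python
import numpy as np

a = np.arange(5)

print(type(a.shape))     # <class 'tuple'>
print(len(a.shape))      # 1, one entry per dimension
print(type((5,)))        # <class 'tuple'>
print(type((5)))         # <class 'int'>, the parentheses only group
print(a.shape == (5,))   # True
```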

What you want is a 2-dimensional array with a shape of (10000, 1). Adding a dimension of length 1 doesn't require any additional data; it is basically an "empty" dimension. To add a dimension to an existing array you can use either np.expand_dims() or np.reshape().

Using np.expand_dims:

import numpy as np
b = np.arange(5)  # >>> array([0, 1, 2, 3, 4])
b = np.expand_dims(b, axis=1)  # >>> array([[0],[1],[2],[3],[4]])

np.expand_dims exists specifically to add length-1 dimensions to arrays. The axis keyword specifies the position the newly added dimension occupies in the resulting shape.
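To see what the axis keyword does, the following sketch compares a few positions. The variable names are made up for the example.

```python
import numpy as np

b = np.arange(5)                      # shape (5,)

front = np.expand_dims(b, axis=0)     # new axis first
back = np.expand_dims(b, axis=1)      # new axis second
last = np.expand_dims(b, axis=-1)     # negative values count from the end

print(front.shape)   # (1, 5)
print(back.shape)    # (5, 1)
print(last.shape)    # (5, 1)
```

For a 1D input, axis=1 and axis=-1 give the same column shape, which is the one the question asks for.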

Using np.reshape:

import numpy as np
a = np.arange(5)
X_test_reshaped = np.reshape(a, (-1, 1)) # >>> array([[0],[1],[2],[3],[4]])

The second argument, (-1, 1), specifies what the shape should look like after the reshape operation. NumPy replaces the -1 internally with whatever length fits the data, here 5. Reshape is a more powerful function than expand_dims and can be used in many different ways. You can read more about its other uses in the NumPy documentation for numpy.reshape().
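Applied to the case in the question, a sketch might look like this. The y_test array here is a stand-in filled with zeros, not the real MNIST labels, so the example runs without downloading anything.

```python
import numpy as np

# Stand-in for the labels returned by mnist.load_data() (assumption: same shape).
y_test = np.zeros(10000, dtype=np.uint8)
print(y_test.shape)       # (10000,)

# -1 lets NumPy work out the number of rows from the data.
y_column = y_test.reshape(-1, 1)
print(y_column.shape)     # (10000, 1)

# The same trick converts back to 1D.
y_flat = y_column.reshape(-1)
print(y_flat.shape)       # (10000,)

# -1 can stand in for any single dimension, not just the first.
print(np.arange(12).reshape(3, -1).shape)   # (3, 4)
```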

Upvotes: 5
