Thomas Lee
Thomas Lee

Reputation: 383

Why the second dimension in a Numpy array is empty?

Why the output here

array = np.arange(3)
array.shape

is

(3,)

and not

(1,3)

What does the missing dimension means or equals?

Upvotes: 8

Views: 7299

Answers (1)

hpaulj
hpaulj

Reputation: 231385

In case there's confusion, (3,) doesn't mean there's a missing dimension. The comma is part of the standard Python notation for a single element tuple. Shapes (1,3), (3,), and (3,1) are distinct,

While they can contain the same 3 elements, their use in calculations (broadcasting) is different, their print format is different, and their list equivalent is different:

In [21]: np.array([1,2,3])
Out[21]: array([1, 2, 3])
In [22]: np.array([1,2,3]).tolist()
Out[22]: [1, 2, 3]
In [23]: np.array([1,2,3]).reshape(1,3).tolist()
Out[23]: [[1, 2, 3]]
In [24]: np.array([1,2,3]).reshape(3,1).tolist()
Out[24]: [[1], [2], [3]]

And we don't have to stop at adding just one singleton dimension:

In [25]: np.array([1,2,3]).reshape(1,3,1).tolist()
Out[25]: [[[1], [2], [3]]]
In [26]: np.array([1,2,3]).reshape(1,3,1,1).tolist()
Out[26]: [[[[1]], [[2]], [[3]]]]

In numpy an array can have 0, 1, 2 or more dimensions. 1 dimension is just as logical as 2.

In MATLAB a matrix always has 2 dim (or more), but it doesn't have to be that way. Strictly speaking MATLAB doesn't even have scalars. An array with shape (3,) is missing a dimension only if MATLAB is taken as the standard.

numpy is built on Python which as scalars, and lists (which can nest). How many dimensions does a Python list have?

If you want to get into history, MATLAB was developed as a front end to a set of Fortran linear algebra routines. Given the problems those routines solved the concept of matrix with 2 dimensions, and row vs column vectors made sense. It wasn't until version 3.something that MATLAB was generalized to allow more than 2 dimensions (in the late 1990s).

numpy is based on several attempts to provide arrays to Python (e.g. numeric). Those developers took a more general approach to arrays, one where 2d was an artificial constraint. That has precedence in computer languages and mathematics (and physics). APL was developed in the 1960s, first as a mathematical notation, and then as a computer language. Like numpy its arrays can be 0d or higher. (Since I used APL before I used MATLAB, the numpy approach feels quite natural.)


In APL there aren't separate lists or tuples. So the shape of an array, rho A is itself an array, and rho rho A is the number of dimensions of A, also called the rank.

http://docs.dyalog.com/14.0/Dyalog%20APL%20Idioms.pdf

Upvotes: 7

Related Questions