Reputation: 3819

Clear, authoritative explanation of NumPy axis numbers?

I am getting confused by contradictory explanations of what exactly the term axis means in numpy and how these constructs are numbered.

Here's one explanation:
Axes are defined for arrays with more than one dimension.
A 2-dimensional array has two corresponding axes:
the first running vertically downwards across rows (axis 0), and
the second running horizontally across columns (axis 1).

So, in this 3x4 matrix ...

>>> b = np.arange(12).reshape(3,4)
>>> b
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

(axis 0) is the 3 rows
(axis 1) is the 4 columns

So the rule might be ...

In an MxN matrix, (axis 0) is M and (axis 1) is N.

Is this correct?

So, in a 3-dimensional matrix AxBxC,
(axis 0) is A
(axis 1) is B
(axis 2) is C

Is this correct?
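One way to check the proposed rule is to read it straight off the array's shape. This is only a minimal sketch; the sizes 2, 3 and 4 are arbitrary stand-ins for A, B and C:

```python
import numpy as np

# Arbitrary example sizes standing in for A, B and C.
A, B, C = 2, 3, 4
m = np.zeros((A, B, C))

# Axis k is simply position k in the shape tuple.
axis_lengths = [m.shape[k] for k in range(m.ndim)]
print(m.ndim)        # 3
print(axis_lengths)  # [2, 3, 4]
```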

Upvotes: 5

Answers (4)

sui_generis

Reputation: 11

A handy way to remember this is that axis=0 collapses the rows, while axis=1 collapses the columns.

When a 3*4 array is summed with axis=0, the output has 4 elements (1*4 if the collapsed axis is kept): all the rows are collapsed and the aggregation is done column-wise.

The same function with axis=1 collapses the columns and yields 3 elements (3*1 if the collapsed axis is kept), with the aggregation done along each row.
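As a rough illustration of the "collapsing" described here, the following sketch reuses the 3x4 array b from the question. Note that NumPy drops the collapsed axis by default; the keepdims=True argument is what gives the 1*4 and 3*1 shapes:

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

col_sums = b.sum(axis=0)  # collapses the 3 rows: one value per column
row_sums = b.sum(axis=1)  # collapses the 4 columns: one value per row
print(col_sums, col_sums.shape)  # [12 15 18 21] (4,)
print(row_sums, row_sums.shape)  # [ 6 22 38] (3,)

# keepdims=True keeps the collapsed axis with length 1.
kept0 = b.sum(axis=0, keepdims=True)
kept1 = b.sum(axis=1, keepdims=True)
print(kept0.shape)  # (1, 4)
print(kept1.shape)  # (3, 1)
```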

The linked image should help in assimilating this concept: Example for understanding

Upvotes: 1

Debashis Sahoo

Reputation: 5962

If someone needs a clear idea, here is the picture:

Upvotes: 5

lvlv

Reputation: 71

Although it is possible to picture this in 3D, I personally find it hard to picture in 4D or 5D, so I gave up on that and think about it from an implementation perspective instead. Basically, an N-dimensional array corresponds to N nested for loops, and to reduce one specific axis we just accumulate over the loop for that axis. For example, given a 3x3x3 tensor, axis=0 loops over i in a[i][x][x], axis=1 loops over i in a[x][i][x], and axis=2 loops over i in a[x][x][i]. 4D, 5D, and so on work the same way.

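import numpy as np

def my_reduce_max(a, axis=0):
    # Works for any 3-D array; the output drops the reduced axis.
    out_shape = [n for d, n in enumerate(a.shape) if d != axis]
    b = [[None] * out_shape[1] for _ in range(out_shape[0])]
    for j in range(out_shape[0]):
        for k in range(out_shape[1]):
            tmp_max = None
            # i runs over the axis being reduced
            for i in range(a.shape[axis]):
                if axis == 0:
                    get_value = a[i][j][k]
                elif axis == 1:
                    get_value = a[j][i][k]
                else:
                    get_value = a[j][k][i]
                # Start from the first value so negative entries also work.
                tmp_max = get_value if tmp_max is None else max(get_value, tmp_max)
            b[j][k] = tmp_max

    return b

a = np.arange(27).reshape((3, 3, 3))
print(a)
print(my_reduce_max(a, 2))  # same values as a.max(axis=2)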
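The same loop idea carries over to any number of dimensions. The following is only a sketch: reduce_max_nd is a hypothetical helper name, it assumes the reduced axis is non-empty, and it uses np.moveaxis and np.ndindex to loop over the remaining axes before comparing against NumPy's own a.max(axis=...):

```python
import numpy as np

def reduce_max_nd(a, axis):
    """Hypothetical helper: max over one axis using explicit loops."""
    # Move the axis to be reduced to the end so it becomes the innermost loop.
    moved = np.moveaxis(a, axis, -1)
    out = np.empty(moved.shape[:-1], dtype=a.dtype)
    # np.ndindex yields every index tuple of the remaining (kept) axes.
    for idx in np.ndindex(*out.shape):
        lane = moved[idx]  # 1-D view along the reduced axis
        best = lane[0]
        for value in lane[1:]:
            best = max(best, value)
        out[idx] = best
    return out

a = np.arange(120).reshape(2, 3, 4, 5)
for axis in range(a.ndim):
    assert np.array_equal(reduce_max_nd(a, axis), a.max(axis=axis))
print(reduce_max_nd(a, 1).shape)  # (2, 4, 5)
```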

Upvotes: 0

ali_m

Reputation: 74252

Everything you said is correct, with the exception of

Axes are defined for arrays with more than one dimension.

Axes are also defined for one dimensional arrays - there is just one of them (i.e. axis 0).
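A small sketch of the one-dimensional case (the array v is just an arbitrary example): axis 0 exists and can be reduced, while asking for axis 1 raises an error:

```python
import numpy as np

v = np.arange(5)       # one-dimensional: shape (5,)
print(v.ndim)          # 1

total = v.sum(axis=0)  # reducing the only axis leaves a 0-d scalar
print(total, np.ndim(total))  # 10 0

axis1_failed = False
try:
    v.sum(axis=1)
except ValueError as err:  # NumPy raises AxisError, a ValueError subclass
    axis1_failed = True
    print("no axis 1:", err)
```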

One intuitive way to think about axes is to consider what happens when you apply a reduction operation over one axis, such as summation. For example, suppose I have some array x:

x = np.arange(60).reshape(3, 4, 5)

If I compute x.sum(0) I am "collapsing" x over the first dimension (i.e. axis 0), so I end up with a (4, 5) array. Likewise, x.sum(1) gives me a (3, 5) array and x.sum(2) gives me a (3, 4) array.
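The shapes quoted above can be checked with a short loop. This is a sketch using the same x; the point is that the result's shape is the original shape with the summed axis removed:

```python
import numpy as np

x = np.arange(60).reshape(3, 4, 5)

result_shapes = {}
for axis in range(x.ndim):
    # Expected shape: the original shape with entry `axis` dropped.
    expected = x.shape[:axis] + x.shape[axis + 1:]
    result_shapes[axis] = x.sum(axis).shape
    assert result_shapes[axis] == expected

print(result_shapes)  # {0: (4, 5), 1: (3, 5), 2: (3, 4)}
```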

An integer index into a single axis of x will also give me an output with one fewer axis. For example, x[0, :, :] gives me the first "row" of x, which has shape (4, 5), x[:, 0, :] gives me the first "column" with shape (3, 5), and x[:, :, 0] gives me the first slice in the third dimension of x with shape (3, 4).
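The indexing behaviour can be sketched the same way. np.take is used here only to show that selecting "index 0 along axis 1" by axis number matches the slice notation, and a length-1 slice is included to show how the axis can be kept:

```python
import numpy as np

x = np.arange(60).reshape(3, 4, 5)

print(x[0, :, :].shape)  # (4, 5)
print(x[:, 0, :].shape)  # (3, 5)
print(x[:, :, 0].shape)  # (3, 4)

# np.take selects along an axis given by number.
taken = np.take(x, 0, axis=1)
assert np.array_equal(taken, x[:, 0, :])

# A slice of length 1 (instead of an integer) keeps the axis.
kept = x[:, 0:1, :]
print(kept.shape)  # (3, 1, 5)
```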

Upvotes: 7
