Reputation: 51
I am new to NumPy and was using numpy.delete, whose documentation says that to delete a horizontal row we should use axis=0. But the NumPy glossary says that the horizontal axis is 1. It would be great if someone could tell me what is wrong in my understanding.
Upvotes: 3
Views: 3589
Reputation: 8608
An array is a systematic way of structuring numbers in grids of any dimensionality. The grid directions have labels, and these labels come from a convention of how new dimensions are added to a grid.
Here's the convention:
The simplest such grid is a 0-dimensional (0D) array, which has no axes and can only hold a scalar. This is a 0D array:
42
If we start putting scalars into a list we get a 1D array. This new grid only has one axis, and if we want to label that axis with a number, we'd better start with something simple, like axis=0! A 1D array could be:
# ----0--->
[42, π, √2]
Now we want to create an array of 1D arrays, which will give us a 2D array. NumPy numbers the axes in the order of the indices, outermost first. The new vertical direction, which steps from one row to the next, is therefore axis=0, and the direction along each row is now axis=1. Here's what it could look like:
#  ----1---->
[[42, π, √2],    # |
 [1, 2, 3],      # 0
 [10, 20, 30]]   # V
The true beauty is that this generalizes to any number of dimensions. If we need a box of numbers we'd create a 3D array by stacking 2D arrays, and the box then has three directions, labelled axis=0, axis=1 and axis=2. If we wanted a 4D array, we would just make a list of boxes (3D arrays), which adds a fourth label, axis=3. This can go on forever.
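As a quick check of how many axes each of these grids has, here is a minimal sketch that builds them and prints `ndim` and `shape`. The variable names are illustrative only, and the 3D box is made with `np.stack`, which by default places the newly created axis first.

```python
import numpy as np

scalar = np.array(42)                          # 0D: no axes at all
vector = np.array([42.0, np.pi, np.sqrt(2)])   # 1D: only axis 0
matrix = np.array([[42.0, np.pi, np.sqrt(2)],
                   [1.0, 2.0, 3.0],
                   [10.0, 20.0, 30.0]])        # 2D: axes 0 and 1
box = np.stack([matrix, matrix])               # 3D: axes 0, 1 and 2

for name, a in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("box", box)]:
    # ndim is the number of axes; shape has one entry per axis
    print(name, a.ndim, a.shape)
# scalar 0 ()
# vector 1 (3,)
# matrix 2 (3, 3)
# box 3 (2, 3, 3)
```

The length of `shape` always equals `ndim`, and entry `k` of `shape` is the number of positions along axis `k`.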
In NumPy:
Any function or method that takes an axis argument uses this convention. For a 2D array this means that np.delete(X, [1, 2, 3], axis=0) selects positions along axis 0 and returns a copy of X without rows 1, 2 and 3. The same logic applies when getting values from an array:
X[rows_along_0th_axis, columns_along_1st_axis, ..., vectors_along_nth_axis]
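To make the link between the axis argument and index positions concrete, here is a small sketch. The array `X` and its 5x4 shape are assumptions chosen for illustration; they are not from the original post.

```python
import numpy as np

# Hypothetical example array: 5 rows (axis 0) and 4 columns (axis 1)
X = np.arange(20).reshape(5, 4)

# Remove the positions 1, 2 and 3 along axis 0, i.e. three rows
without_rows = np.delete(X, [1, 2, 3], axis=0)
print(without_rows.shape)   # (2, 4)
print(without_rows)
# [[ 0  1  2  3]
#  [16 17 18 19]]

# The first index slot addresses axis 0, so keeping rows 0 and 4
# by indexing gives the same result
kept = X[[0, 4], :]
print(np.array_equal(without_rows, kept))   # True

# The second index slot addresses axis 1: this picks column 1
print(X[:, 1])              # [ 1  5  9 13 17]
```

Note that np.delete returns a new array and leaves `X` unchanged.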
Upvotes: 6
Reputation: 518
Taking from the links that you provided, here are the excerpts from the numpy delete documentation and the glossary that probably caused the confusion, followed by a clarification.
>>> arr = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
>>> arr
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
>>> np.delete(arr, 1, 0)
array([[ 1,  2,  3,  4],
       [ 9, 10, 11, 12]])
the first running vertically downwards across rows (axis 0), and the second running horizontally across columns (axis 1)
I think the confusion derives from the words vertically and horizontally in the second excerpt.
What the second excerpt means is that setting axis decides along which dimension to move. For example, in a 2D matrix, axis=0 corresponds to moving from row to row (thus moving vertically over the array), while axis=1 corresponds to moving from column to column (so moving horizontally over the array). It describes a direction of movement; it does not say which axis to pass in order to remove a horizontal row. A row is one position along axis 0, so deleting a row means picking an index along axis=0.
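One way to see the direction of movement is with a reduction. The sketch below uses `sum`, which is not part of the original question and is only an illustration, on the same `arr` as in the excerpt.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12]])

# Moving along axis 0 means walking down the rows, so each column
# is collapsed into a single total
col_totals = arr.sum(axis=0)
print(col_totals)   # [15 18 21 24]

# Moving along axis 1 means walking across the columns, so each row
# is collapsed into a single total
row_totals = arr.sum(axis=1)
print(row_totals)   # [10 26 42]
```

In both calls the axis that is named is the one that gets traversed and disappears from the result's shape.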
The delete function follows the above description: with np.delete(arr, 1, axis=0), the function iterates over the rows and deletes the row with index 1. If columns should be deleted instead, use axis=1. For example, on the same array arr:
>>> np.delete(arr, [0,1], axis=1)
array([[ 3,  4],
       [ 7,  8],
       [11, 12]])
Here delete iterates over the columns, and the columns with indices 0 and 1 are deleted. Every index passed must exist: arr has no column with index 4, and recent NumPy releases raise an IndexError for such an out-of-bounds index (older releases ignored it with a deprecation warning).
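How an out-of-range index such as 4 is treated depends on the NumPy version, so a defensive sketch can filter the requested indices before calling delete. The names `requested` and `valid` are hypothetical helpers, not part of NumPy.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12]])

requested = [0, 1, 4]        # column 4 does not exist in a 4-column array
n_cols = arr.shape[1]

# Keep only indices that are in bounds (negative indices count from the end)
valid = [i for i in requested if -n_cols <= i < n_cols]

result = np.delete(arr, valid, axis=1)
print(valid)    # [0, 1]
print(result)
# [[ 3  4]
#  [ 7  8]
#  [11 12]]
```

Silently dropping bad indices can hide bugs, so in real code it may be preferable to let the error surface instead of filtering.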
Upvotes: 5