Reputation: 19
I have a matrix stored as a 2D np.array and I would like to remove all rows that contain an element x in a specific column. My goal is to return a matrix without these rows, so it should be smaller. My function looks like this:
def delete_rows(matrix, x, col):
    for i in range(matrix.shape[0]-1):
        if(matrix[i,col] == x):
            np.delete(matrix, i, axis = 0)
    return matrix
Sadly in my test the shape of the matrix stayed the same after removing rows. I think the deleted rows were substituted by rows with 0s. Any advice on how I can achieve my goal?
Upvotes: 0
Views: 3654
Reputation: 19322
EDIT: added condition for the specific column to check
You don't need to use any apply methods for this. It can be solved with basic boolean indexing as follows -
arr[~(arr[:,col] == val),:]
arr[:,col] selects the specific column from the array
arr[:,col] == val checks for the value and returns True where it exists, else False
~(arr[:,col] == val) inverts the True and False values
arr[~(arr[:,col] == val),:] keeps only the rows whose boolean index is True and discards all False ones
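To make the steps above concrete, here is a minimal sketch that walks through each intermediate result on a small made-up array. The names demo, col_values, matches and keep are only illustrative and not part of the original answer.

```python
import numpy as np

# Small made-up array, for illustration only
demo = np.array([[1, 8, 3],
                 [4, 5, 8],
                 [7, 8, 9]])
val, col = 8, 1

col_values = demo[:, col]      # the column to check: [8 5 8]
matches = col_values == val    # True where the value occurs: [ True False  True]
keep = ~matches                # inverted mask: [False  True False]
result = demo[keep, :]         # only the rows where keep is True

print(result)                  # [[4 5 8]]
```

Note that the 8 in the last column of the surviving row does not matter, because only column 1 is checked.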
arr = np.array([[12, 10, 12,  0,  9,  4, 12, 11],
                [ 3, 10, 14,  5,  4,  3,  6,  6],
                [12, 10,  1,  0,  5,  7,  5, 10],
                [12,  8, 14, 14, 12,  3, 14, 10],
                [ 9, 14,  3,  8,  1, 10,  9,  6],
                [10,  3, 11,  3, 12, 13, 11, 10],
                [ 0,  6,  8,  8,  5,  5,  1, 10],   #<- this to remove
                [13,  6,  1, 10,  7, 10, 10, 13],
                [ 3,  3,  8, 10, 13,  0,  0, 10],   #<- this to remove
                [ 6,  2, 13,  5,  8,  2,  8, 10]])
#                         ^
#                 this column to check
#boolean indexing approach
val, col = 8,2 #value to check is 8 and column to check is 2
out = arr[~(arr[:,col] == val),:] #<-----
out
array([[12, 10, 12,  0,  9,  4, 12, 11],
       [ 3, 10, 14,  5,  4,  3,  6,  6],
       [12, 10,  1,  0,  5,  7,  5, 10],
       [12,  8, 14, 14, 12,  3, 14, 10],
       [ 9, 14,  3,  8,  1, 10,  9,  6],
       [10,  3, 11,  3, 12, 13, 11, 10],
       [13,  6,  1, 10,  7, 10, 10, 13],
       [ 6,  2, 13,  5,  8,  2,  8, 10]])
If you want to check for the value in all columns then try this -
arr[~(arr == val).any(1),:]
And if you want to keep ONLY rows with the value instead, just remove ~ from the condition.
arr[(arr[:,col] == val),:]
If you want to remove the column as well, use np.delete -
np.delete(arr[~(arr[:,col] == val),], col, axis=1)
Note: You cannot remove both rows and columns at once using np.delete, so if you plan to use it, you will need to call np.delete two times: once for axis = 0 (rows) and once for axis = 1 (columns).
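As a rough sketch of the two-call approach described in the note, the following uses np.flatnonzero to turn the boolean condition into row indices. The array and the names rows_to_drop, without_rows and without_both are made up for illustration.

```python
import numpy as np

# Small made-up array, for illustration only
arr = np.array([[1, 8, 3],
                [4, 5, 6],
                [7, 8, 9]])
val, col = 8, 1

rows_to_drop = np.flatnonzero(arr[:, col] == val)    # indices of matching rows: [0 2]
without_rows = np.delete(arr, rows_to_drop, axis=0)  # first call: remove the rows
without_both = np.delete(without_rows, col, axis=1)  # second call: remove the column

print(without_both)   # [[4 6]]
print(arr.shape)      # (3, 3): np.delete returns new arrays, arr is unchanged
```

Each np.delete call returns a new array rather than modifying its input, so the result has to be assigned to a name.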
Upvotes: 2
Reputation: 3583
Assuming you have an array like this:
array([[12,  5,  0,  3, 11,  3,  7,  9,  3,  5],
       [ 2,  4,  7,  6,  8,  8, 12, 10,  1,  6],
       [ 7,  7, 14,  8,  1,  5,  9, 13,  8,  9],
       [ 4,  3,  0,  3,  5, 14,  0,  2,  3,  8],
       [ 1,  3, 13,  3,  3, 14,  7,  0,  1,  9],
       [ 9,  0, 10,  4,  7,  3, 14, 11,  2,  7],
       [12,  2,  0,  0,  4,  5,  5,  6,  8,  4],
       [ 1,  4,  9, 10, 10,  8,  1,  1,  7,  9],
       [ 9,  3,  6,  7, 11, 14,  2, 11,  0, 14],
       [ 3,  5, 12,  9, 10,  4, 11,  4,  6,  4]])
You can remove all rows containing a 3 like this:
row_mask = np.apply_along_axis(np.any, 1, arr == 3)
arr = arr[~row_mask]
Your new array looks like this:
array([[ 2,  4,  7,  6,  8,  8, 12, 10,  1,  6],
       [ 7,  7, 14,  8,  1,  5,  9, 13,  8,  9],
       [12,  2,  0,  0,  4,  5,  5,  6,  8,  4],
       [ 1,  4,  9, 10, 10,  8,  1,  1,  7,  9]])
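For comparison, the same row mask can also be built with the array's own any method along axis 1, which avoids the apply_along_axis call. This is a small sketch on a made-up array; the names mask_apply and mask_any are illustrative only.

```python
import numpy as np

# Small made-up array, for illustration only
arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [3, 8, 9],
                [7, 7, 7]])

mask_apply = np.apply_along_axis(np.any, 1, arr == 3)  # approach from this answer
mask_any = (arr == 3).any(axis=1)                      # equivalent built-in reduction

print(np.array_equal(mask_apply, mask_any))  # True
print(arr[~mask_any])                        # rows without a 3: [[4 5 6], [7 7 7]]
```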
Upvotes: 2
Reputation: 2785
This can be simply done in one line:
import numpy as np

def delete_rows(matrix, x, col):
    return matrix[matrix[:,col]!=x,:]
For example, if we want to remove all the rows that contain 5 in the second column from matrix A:
>>> A = np.array([[1,2,3], [4,5,6], [7,8,9]])
>>> print(delete_rows(A, 5, 1))
[[1 2 3]
 [7 8 9]]
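As a side note on why the loop in the question left the shape unchanged: np.delete does not modify its input but returns a new array, so calling it without using the return value has no visible effect. A minimal sketch, reusing the matrix A from above (the name B is illustrative):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

np.delete(A, 1, axis=0)        # return value is discarded
print(A.shape)                 # (3, 3): A itself is unchanged

B = np.delete(A, 1, axis=0)    # keep the returned array instead
print(B.shape)                 # (2, 3)
```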
Upvotes: 1