Reputation: 583
I have a matrix A with a large number of rows and columns (an example is below) that occasionally contains a row consisting entirely of 0 values (row 4 in this example).
I want a function that checks all rows of A and lets me perform an operation on each element of those all-zero rows. Is there an easy way to do that?
I also wonder whether a matrix is the right data structure for this. It does not feel quite right; perhaps a data frame would be better?
A = matrix(
c(0, 0, 1, 0, 0, 0, 0,
1, 0, 1, 1, 0, 0, 0,
0, 0, 0, 1, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 1,
0, 0, 0, 1, 1, 0, 1), nrow=7,ncol=7,byrow = TRUE)
For every row of that matrix I want to determine whether it contains only 0's. If so, I want to set each element of that row to 1/N (where N is ncol(A)).
Pseudocode:
If (sum(row of A) == 0) then row_of_A = 1/ncol(A)
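A literal translation of the pseudocode into R could look like the sketch below. The small 3x3 matrix is a hypothetical stand-in for the real data, and the sum test only identifies all-zero rows when the matrix has no negative entries (as with a 0/1 matrix), because otherwise values could cancel out.

```r
# Small example matrix (hypothetical, for illustration only)
A <- matrix(c(0, 1, 0,
              0, 0, 0,
              1, 1, 0), nrow = 3, ncol = 3, byrow = TRUE)

# Loop over the rows; if a row sums to 0, set every cell to 1/ncol(A).
# Assumes non-negative entries, so a zero sum implies an all-zero row.
for (i in seq_len(nrow(A))) {
  if (sum(A[i, ]) == 0) {
    A[i, ] <- 1 / ncol(A)
  }
}
A
```

An explicit loop like this is easy to read but is likely to be slow on a large matrix, which is why a vectorized approach is usually preferred.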
Upvotes: 1
Views: 1303
Reputation: 132576
Apparently you want this:
A[rowSums(A != 0) == 0,] <- 1/ncol(A)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#[1,] 0.0000000 0.0000000 1.0000000 0.0000000 0.0000000 0.0000000 0.0000000
#[2,] 1.0000000 0.0000000 1.0000000 1.0000000 0.0000000 0.0000000 0.0000000
#[3,] 0.0000000 0.0000000 0.0000000 1.0000000 1.0000000 0.0000000 0.0000000
#[4,] 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571
#[5,] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 1.0000000 1.0000000
#[6,] 0.0000000 0.0000000 0.0000000 1.0000000 1.0000000 0.0000000 1.0000000
#[7,] 0.0000000 0.0000000 1.0000000 0.0000000 0.0000000 0.0000000 0.0000000
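Explanation:
A != 0 checks all matrix elements and returns a logical matrix with TRUE for non-zero elements. rowSums then sums each row of that logical matrix; in the sum, FALSE/TRUE is coerced to 0/1, so a row sum of 0 means the row has no non-zero element. Those rows are selected and assigned 1/ncol(A).
Benchmarks to show that apply is slower: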
set.seed(42); A = matrix(sample(0:1, 5e4, TRUE), nrow=1e4)
library(microbenchmark)
microbenchmark(A[rowSums(A != 0) == 0,],
A[!apply(A != 0, 1, any),],
A[apply(A == 0, 1, all),])
#Unit: microseconds
# expr min lq mean median uq max neval cld
# A[rowSums(A != 0) == 0, ] 572.202 593.298 620.7931 624.248 629.638 780.387 100 a
# A[!apply(A != 0, 1, any), ] 14978.248 16124.652 17261.9530 17441.054 18129.975 22469.219 100 b
# A[apply(A == 0, 1, all), ] 15182.122 16149.751 17616.8010 16561.657 17997.703 75148.079 100 b
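Since the question asks for a function, the one-liner can be wrapped as in the sketch below. The name fill_zero_rows and its value argument are hypothetical, and the code assumes a numeric matrix without NA values. The third row of the example sums to zero without being all zeros, which is why the test uses m != 0 rather than the row sum of the raw values.

```r
# Hypothetical helper: replace every all-zero row of a numeric matrix
# with a constant (default 1/ncol(m)). Assumes m contains no NA values.
fill_zero_rows <- function(m, value = 1 / ncol(m)) {
  nonzero <- m != 0                    # logical matrix, TRUE where non-zero
  zero_rows <- rowSums(nonzero) == 0   # TRUE for rows with no non-zero entry
  m[zero_rows, ] <- value              # overwrite only those rows
  m
}

# Row 2 is all zeros; row 3 sums to 0 but is not all zeros.
B <- matrix(c( 0, 2, 0,
               0, 0, 0,
              -1, 1, 0), nrow = 3, byrow = TRUE)
fill_zero_rows(B)
```

Only the second row is replaced here; the third row is left untouched because it contains non-zero entries.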
Upvotes: 4