Reputation: 187
I have a matrix of data, and I want to check, row by row, whether the absolute value of the entry in each column falls at or below a column-specific threshold. I would then like to calculate the proportion of rows where this holds across all columns. I know how to do this manually, but I would like to write it generally, without a loop, so that it works for any matrix x and vector y the user supplies. The only additional piece of information is that the number of columns of x will always equal the length of y. I would also like to do this in base R if possible. Here is my R code:
set.seed(42)
# Made up data
x <- matrix(rnorm(27), nrow = 9)
y <- c(.2, .5, 2)
> sum(abs(x[,1]) <= y[1] & abs(x[,2]) <= y[2] & abs(x[,3]) <= y[3]) / nrow(x)
[1] 0.2222222
So ideally I would want something like
sum(abs(x) <= y) / nrow(x)
Upvotes: 1
Views: 34
Reputation: 160417
sum(rowSums(t(t(abs(x)) <= y)) == ncol(x)) / nrow(x)
# [1] 0.2222222
Walk-through:
Unfortunately, abs(x) <= y recycles y down the columns of x (column-major order), so it is effectively doing c(abs(x[1,1]) <= y[1], abs(x[2,1]) <= y[2], abs(x[3,1]) <= y[3], abs(x[4,1]) <= y[1], ...), which is not what we want. We can transpose x with t() so that we get the correct recycling of y (each column of the transpose is one row of x, so y lines up element by element), and then transpose it again to get it back in the same shape as x (not strictly required).
t(t(abs(x)) <= y)
#       [,1]  [,2]  [,3]
# [1,] FALSE  TRUE FALSE
# [2,] FALSE FALSE  TRUE
# [3,] FALSE FALSE  TRUE
# [4,] FALSE FALSE  TRUE
# [5,] FALSE  TRUE  TRUE
# [6,]  TRUE  TRUE  TRUE
# [7,] FALSE FALSE  TRUE
# [8,]  TRUE  TRUE  TRUE
# [9,] FALSE FALSE  TRUE
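To see the recycling issue in isolation, here is a small sketch on a toy matrix. The names m and thr are made up for illustration and are not part of the question's data. It compares the naive comparison with the double-transpose version, and also shows sweep(), which is another base R way to apply a per-column value and should give the same result.

```r
# Toy data, chosen only to make the recycling visible (not from the question)
m   <- matrix(1:6, nrow = 2)  # columns are (1, 2), (3, 4), (5, 6)
thr <- c(2, 3, 6)             # one threshold per column

# Naive comparison: thr is recycled down the columns of m,
# so m[1, 2] is compared with thr[3], m[2, 2] with thr[1], and so on
naive <- m <= thr

# Double transpose: after t(), each column holds one row of m,
# so thr lines up with the original columns
by_col <- t(t(m) <= thr)

# Base R alternative: sweep() applies thr along margin 2 (columns)
by_col_sweep <- sweep(m, 2, thr, FUN = "<=")

naive         # differs from by_col in column 3
by_col
by_col_sweep  # same values as by_col
```

Whether t(t(.)) or sweep() reads better is largely a matter of taste; both avoid an explicit loop.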
Now we want to know how many rows have as many TRUEs as x has columns, which is done with rowSums(.) == ncol(x). Then sum(.) counts those rows, and dividing by nrow(x) gives the proportion.
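Since the question asks for something that works for any x and y, the steps above could be wrapped in a small function. This is only a sketch: the name prop_within is hypothetical, the input check is an added assumption, and it uses sweep() and mean() in place of t(t(.)) and sum(.) / nrow(x), which should be equivalent here.

```r
# Hypothetical helper (name and input check are assumptions, not from the post)
prop_within <- function(x, y) {
  # The question guarantees ncol(x) == length(y); fail early otherwise
  stopifnot(is.matrix(x), ncol(x) == length(y))

  # Compare abs(x[, j]) with y[j] for every column j
  within <- sweep(abs(x), 2, y, FUN = "<=")

  # A row counts only if all of its columns pass;
  # the mean of a logical vector is the proportion of TRUEs
  mean(rowSums(within) == ncol(x))
}

# Usage with the data from the question
set.seed(42)
x <- matrix(rnorm(27), nrow = 9)
y <- c(.2, .5, 2)
prop_within(x, y)
# [1] 0.2222222
```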
Upvotes: 1