Reputation: 1947
Let's create some artificial data and their 0.99 quantiles.
set.seed(42)
x = data.frame("Norm" = rnorm(100),
               "Unif" = runif(100),
               "Exp" = rexp(100))
quants <- apply(x, 2, quantile, 0.99)
I want to check, without a loop, which elements of each variable are larger than that variable's 0.99 quantile.
So the first variable should be compared with the first element of quants, the second with the second, and the third with the third.
Intuitively I used: x > quants
and it's good that I checked the outcome, because R seems to interpret this command differently.
e.g.
> head(x > quants)
Norm Unif Exp
[1,] FALSE FALSE FALSE
[2,] FALSE FALSE FALSE
[3,] FALSE FALSE TRUE
[4,] FALSE FALSE FALSE
[5,] FALSE FALSE FALSE
[6,] FALSE FALSE TRUE
As you can see, the third element of Exp is flagged as being larger than the 0.99 quantile. However:
> x[3, ][3] > quants[3]
Exp
3 FALSE
gives FALSE. Do you know how I can fix this problem? I tried to play with apply but wasn't sure how to use it properly in this case.
Upvotes: 1
Views: 55
Reputation: 26238
Actually, when evaluating x > quants
R recycles quants down the columns (column-wise) instead of across the rows. The first element of the first column is compared with the first element of quants, the second element of the first column with the second element of quants, and so on. Hence x[3,3] is actually the 203rd element in this order and is compared with the second element of quants (203 %% 3 = 2). That's why you're getting the wrong result.
Also see
colSums(x > quants)
Norm Unif Exp
4 0 19
which makes the problem visible: with 100 observations per column, only about one value per column should exceed its own 0.99 quantile.
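To make the recycling explicit, here is a small base R sketch. The names idx, pos, recycled and by_hand are my own, introduced only for illustration; the data are rebuilt exactly as in the question. It computes which element of quants ends up paired with x[3, 3] and reproduces x > quants by hand.

```r
set.seed(42)
x <- data.frame(Norm = rnorm(100), Unif = runif(100), Exp = rexp(100))
quants <- apply(x, 2, quantile, 0.99)

# Position of x[3, 3] when the data frame is walked column by column:
# two full columns of 100 values, then the third value of column 3
idx <- (3 - 1) * nrow(x) + 3

# Which element of quants is paired with that position after recycling
pos <- (idx - 1) %% length(quants) + 1
names(quants)[pos]

# Rebuild the comparison by hand: recycle quants to the full length,
# lay it out column by column, and compare element by element
recycled <- rep(quants, length.out = nrow(x) * ncol(x))
by_hand <- as.matrix(x) > matrix(recycled, nrow = nrow(x))

# Should agree with what x > quants returns
all(by_hand == (x > quants))
```

So the Exp value in row 3 is compared against the Unif quantile, not the Exp one.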
Upvotes: 2
Reputation: 3067
You could use purrr::map2_df.
# there are two objects I am iterating over
# the columns of the x data.frame are referenced as .x
# the elements of the quants vector are referenced as .y
purrr::map2_df(x, quants, ~ .x > .y)
Upvotes: 2
Reputation: 21938
I think the following code might help you get your desired output:
library(purrr)
set.seed(42)
x = data.frame("Norm" = rnorm(100),
               "Unif" = runif(100),
               "Exp" = rexp(100))
quants <- apply(x, 2, quantile, 0.99)
map2_dfr(x, quants, ~ .x > .y)
# A tibble: 100 x 3
Norm Unif Exp
<lgl> <lgl> <lgl>
1 FALSE FALSE FALSE
2 FALSE FALSE FALSE
3 FALSE FALSE FALSE
4 FALSE FALSE FALSE
5 FALSE FALSE FALSE
6 FALSE FALSE FALSE
7 FALSE FALSE FALSE
8 FALSE FALSE FALSE
9 FALSE FALSE FALSE
10 FALSE FALSE FALSE
# ... with 90 more rows
And here is another easy way if you want to stick to base R:
head(mapply(function(x, y) x > y, x, quants))
Norm Unif Exp
[1,] FALSE FALSE FALSE
[2,] FALSE FALSE FALSE
[3,] FALSE FALSE FALSE
[4,] FALSE FALSE FALSE
[5,] FALSE FALSE FALSE
[6,] FALSE FALSE FALSE
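If you also want to know where the exceedances are, the logical matrix from mapply can be passed to which(). This is only a sketch building on the base R approach above; the names above and hits are mine.

```r
set.seed(42)
x <- data.frame(Norm = rnorm(100), Unif = runif(100), Exp = rexp(100))
quants <- apply(x, 2, quantile, 0.99)

# Column-by-column comparison: each column is paired with its own quantile
above <- mapply(function(col, q) col > q, x, quants)

# Row and column positions of the values above their column's quantile
hits <- which(above, arr.ind = TRUE)
hits

# The corresponding values, pulled out of the original data
as.matrix(x)[hits]
```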
Upvotes: 2
Reputation: 11128
How about this? Here x is your data frame, quants is the vector of values to compare against, and the function applied is the greater-than operator `>`. sweep is applied column-wise here, hence MARGIN = 2:
sweep(x, 2, STATS = quants, FUN = `>`)
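As a quick sanity check (a sketch, not part of the original answer; by_sweep and by_mapply are my own names), you can confirm that sweep and the mapply approach agree, and that the per-column counts now look sensible. I convert to a matrix first only to keep the comparison between two plain logical matrices.

```r
set.seed(42)
x <- data.frame(Norm = rnorm(100), Unif = runif(100), Exp = rexp(100))
quants <- apply(x, 2, quantile, 0.99)

# sweep pairs every column (MARGIN = 2) with its own element of STATS
by_sweep <- sweep(as.matrix(x), 2, STATS = quants, FUN = ">")

# Same comparison done column by column with mapply
by_mapply <- mapply(function(col, q) col > q, x, quants)

# Both approaches should flag exactly the same cells
all(by_sweep == by_mapply)

# Number of values above the 0.99 quantile in each column
colSums(by_sweep)
```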
Upvotes: 3