Reputation: 1037
How can I compute the sum of the values that are above the 99th percentile, and then divide it by the sum of all values to get the percentage of the total that comes from values above the 99th percentile? For example, with the mtcars dataset:
> summary(mtcars$hp)
Min. 1st Qu. Median Mean 3rd Qu. Max.
52.0 96.5 123.0 146.7 180.0 335.0
> quantile(mtcars$hp, 0.99)
99%
312.99
> sum(mtcars$hp)
[1] 4694
So from here I want to sum all the values that are greater than 312.99 and then divide that by 4694.
Upvotes: 1
Views: 377
Reputation: 39667
The condition mtcars$hp > quantile(mtcars$hp, 0.99) gives you a logical vector indicating whether each value is above the 99th percentile. You can use it to subset mtcars$hp, and the result can then be summed.
sum(mtcars$hp[mtcars$hp > quantile(mtcars$hp, 0.99)]) / sum(mtcars$hp)
#[1] 0.0713677
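If it helps to see what the one-liner does, here is the same calculation split into steps. This is only a sketch; the intermediate names (hp, cutoff, above, share) are my own and not part of the original answer.

```r
# mtcars ships with base R (package datasets)
hp <- mtcars$hp

cutoff <- quantile(hp, 0.99)       # 99th percentile, 312.99 as in the question
above  <- hp > cutoff              # logical vector: TRUE where hp exceeds the cutoff

hp[above]                          # the values that get summed
#[1] 335

share <- sum(hp[above]) / sum(hp)  # 335 / 4694
share
```

For hp only one car is above the cutoff, so the numerator is just that single value.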
To express it as a percentage with one decimal place, multiply by 100 and use round:
round(sum(mtcars$hp[mtcars$hp>quantile(mtcars$hp, 0.99)])/sum(mtcars$hp)*100,1)
#[1] 7.1
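If you need this for several columns, you could wrap it in a small helper. The function name share_above_quantile and its arguments are hypothetical, just one way to package the expression above. Dropping NA values by default is my assumption (mtcars$hp has none).

```r
# Hypothetical helper: percentage of sum(x) that comes from values above the p-th quantile
share_above_quantile <- function(x, p = 0.99, digits = 1, na.rm = TRUE) {
  # Assumption: missing values are dropped by default;
  # with na.rm = FALSE, quantile() stops with an error if x contains NA
  if (na.rm) x <- x[!is.na(x)]
  cutoff <- quantile(x, p, names = FALSE)  # plain number, no "99%" name
  round(100 * sum(x[x > cutoff]) / sum(x), digits)
}

share_above_quantile(mtcars$hp)
#[1] 7.1
```

A different cutoff can be passed through p, for example share_above_quantile(mtcars$hp, p = 0.9).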
Upvotes: 2