I have a data frame with groups and values. First, I calculate the 99% quantile per group. Then I want to remove the values above the 99% quantile within each group.
df <- data.frame(group = rep(c("A", "B"), each = 4),
                 value = c(c(6, 5, 80, 4, 60) * 10, 3, 5, 4))
# data
group value
1 A 60
2 A 50
3 A 800
4 A 40
5 B 600
6 B 3
7 B 5
8 B 4
Calculate the quantiles for the individual groups:
quant<-aggregate(df$value, by = list(df$group), FUN = quantile, probs = 0.99)
> quant
Group.1 x
1 A 777.80
2 B 582.15
I tried to use the vector of quantiles to select the lower values. However, this ignores the group each row belongs to:
df[df$value < quant$x,]
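A quick check suggests why the group information is lost: quant$x has length 2, so R recycles it along the 8 rows (A, B, A, B, ...) instead of matching it to each row's group. The sketch below rebuilds the same data; the column name threshold_used is only an illustrative label, not part of the original code.

```r
# Same data and quantiles as above
df <- data.frame(group = rep(c("A", "B"), each = 4),
                 value = c(c(6, 5, 80, 4, 60) * 10, 3, 5, 4))
quant <- aggregate(df$value, by = list(df$group), FUN = quantile, probs = 0.99)

# The comparison recycles quant$x row by row, regardless of each row's group
recycled <- rep(quant$x, length.out = nrow(df))
data.frame(df, threshold_used = recycled)  # threshold_used: illustrative name

# Row 5 (group B, value 600) is compared against group A's threshold (777.8),
# so it is kept: 7 rows survive instead of the expected 6
kept <- df[df$value < quant$x, ]
nrow(kept)
```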
Expected result:
group value
1 A 60
2 A 50
4 A 40
6 B 3
7 B 5
8 B 4
How can I apply the vector of quantiles to keep only the values below the 99% quantile of each group in the data frame?
Upvotes: 3
Views: 979
We can use filter() after grouping:
library(dplyr)
df %>%
group_by(group) %>%
filter(value < quantile(value, probs = 0.99))
# A tibble: 6 x 2
# Groups: group [2]
# group value
# <fctr> <dbl>
#1 A 60
#2 A 50
#3 A 40
#4 B 3
#5 B 5
#6 B 4
Or use similar syntax with data.table:
library(data.table)
setDT(df)[, .(value = value[value < quantile(value, probs = 0.99)]), by = group]
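Or, in base R, using ave(), which applies the comparison within each group and returns a vector aligned with the rows of df:
df[with(df, as.logical(ave(value, group, FUN = function(x) x < quantile(x, probs = 0.99)))), ]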
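If the precomputed quant table from the question should be reused rather than recomputing the quantiles, one possible sketch is to look up each row's threshold with match(). This is an alternative to the approaches above, not a replacement; the names threshold and result are illustrative, and the block recreates df as a plain data frame so it does not depend on the earlier setDT() call.

```r
# Recreate the data and the per-group quantiles from the question
df <- data.frame(group = rep(c("A", "B"), each = 4),
                 value = c(c(6, 5, 80, 4, 60) * 10, 3, 5, 4))
quant <- aggregate(df$value, by = list(df$group), FUN = quantile, probs = 0.99)

# For every row, pick the quantile that belongs to that row's group
threshold <- quant$x[match(df$group, quant$Group.1)]

# Keep only the rows strictly below their own group's threshold
result <- df[df$value < threshold, ]
result
```

Because threshold has one entry per row, no recycling takes place, and the original row names (1, 2, 4, 6, 7, 8) are preserved in result.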
Upvotes: 5