Reputation: 3
If my data frame is called "houses" and I want to exclude the top 5% and bottom 5% of the column Sale_Price, how do I do that?
houses[quantile(Sale_Price, c(.05, .95))
I tried this code, but I'm getting errors.
Upvotes: 0
Views: 1437
Reputation: 388817
Using dplyr, we can do:
library(dplyr)
houses %>% filter(between(Sale_Price,
quantile(Sale_Price, 0.05), quantile(Sale_Price, 0.95)))
Or with data.table
library(data.table)
setDT(houses)
houses[Sale_Price %between% quantile(Sale_Price, c(.05, .95))]
Upvotes: 1
Reputation: 5620
Here is some data that I assume is similar to what you have.
houses <- data.frame(Sale_Price = rnorm(100, 50, 5))
The following code keeps only the rows whose Sale_Price lies between the 5th and 95th percentiles, dropping the bottom 5% and top 5% of values:
#Calculate 0.05 and 0.95 percentiles
quants<-quantile(houses$Sale_Price, probs = c(0.05, 0.95))
#Subset according to the two percentiles
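# Index the rows so the result stays a data frame (drop = FALSE matters when there is only one column)
df1 <- houses[houses$Sale_Price > quants[1] & houses$Sale_Price < quants[2], , drop = FALSE]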
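If you need this for more than one column, the same base R idea can be wrapped in a small function. This is only a sketch: trim_by_quantile is a made-up name, not something from a package. It makes two assumptions that may not suit your data: it uses na.rm = TRUE and drops NA rows, and it uses inclusive bounds (like dplyr's between()) rather than the strict > and < above.

```r
# Hypothetical helper (not from any package): keep the rows of `df` whose
# value in column `col` lies within the [lower, upper] quantile range.
trim_by_quantile <- function(df, col, lower = 0.05, upper = 0.95) {
  # Assumption: ignore NAs when computing the cut-offs
  # (quantile() stops with an error on NA values otherwise).
  q <- quantile(df[[col]], probs = c(lower, upper), na.rm = TRUE)
  # Logical index: TRUE for rows inside the range, bounds included.
  keep <- !is.na(df[[col]]) & df[[col]] >= q[1] & df[[col]] <= q[2]
  # drop = FALSE keeps a data frame even when df has a single column.
  df[keep, , drop = FALSE]
}

set.seed(42)  # only so the simulated data is reproducible
houses <- data.frame(Sale_Price = rnorm(100, 50, 5))
trimmed <- trim_by_quantile(houses, "Sale_Price")
nrow(trimmed)
```

The column name is passed as a string and looked up with df[[col]], so the function does not depend on the data frame being called houses.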
Upvotes: 1