codeforfun

Reputation: 187

Outlier cutoff in R

I am trying to cut off the outliers of a variable in a data frame, but it does not work as expected:

outlier_cutoff1 <- quantile(myd$nov, 0.75) + 1.5 * IQR(myd$nov)
index_outlier1 <- which(myd$nov > outlier_cutoff1)
mydnov <- myd[-index_outlier1, ]

This code does not raise an error, but it does not change the outlier values.

Upvotes: 0

Views: 197

Answers (2)

hachiko

Reputation: 757

I think this is what you are looking for. Let me know if it works for you. I couldn't test it fully without a reproducible example.

myd_wo_outliers <- subset(myd, myd$nov > (Q[1] - 1.5*iqr) & myd$nov < (Q[2]+1.5*iqr))

Check out this page for more details.
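
The one-liner above uses Q and iqr without defining them. Here is a minimal sketch of how it might be completed, assuming Q holds the first and third quartiles of myd$nov and iqr is its interquartile range. The data frame is made up for illustration, since the question does not include one.

```r
# Hypothetical data standing in for the asker's `myd`; 100 is an obvious outlier.
myd <- data.frame(id = 1:10, nov = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100))

# Assumed definitions of the two names used in the answer.
Q   <- quantile(myd$nov, probs = c(0.25, 0.75))  # first and third quartiles
iqr <- IQR(myd$nov)                              # interquartile range

# Keep only rows strictly between Q1 - 1.5 * IQR and Q3 + 1.5 * IQR.
myd_wo_outliers <- subset(myd, nov > (Q[1] - 1.5 * iqr) & nov < (Q[2] + 1.5 * iqr))

nrow(myd_wo_outliers)
```

Inside subset() the column can be referred to as nov directly, so the myd$ prefix is optional there. If nov may contain missing values, quantile() and IQR() would also need na.rm = TRUE.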

Upvotes: 0

jsizzle

Reputation: 78

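There is no need for which here: you can index with the logical vector directly, as long as you negate it with ! rather than - (a minus sign turns TRUE/FALSE into -1/0, which drops the wrong rows).

Looking at your code, I think you can remove the "outliers" with the following:

outlier_cutoff1 <- quantile(myd$nov, 0.75) + 1.5 * IQR(myd$nov)
index_outlier1 <- myd$nov > outlier_cutoff1
mydnov <- myd[!index_outlier1, ]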

Here's a reproducible example that verifiably works (with a vector).

set.seed(123)
nov <- rnorm(500)

outlier_cutoff1 <- quantile(nov, 0.75) + 1.5 * IQR(nov)
# This is 2.574977
index_outlier1 <- nov > outlier_cutoff1
# This is a logical vector indicating which values are greater than 2.574977

mydnov <- nov[!index_outlier1]
# Keep only the values that are not above the cutoff

length(nov)  # 500

length(mydnov)  # 500 - sum(index_outlier1): every value above the cutoff is removed
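
The way the logical vector is negated matters here, and a tiny made-up vector shows why. This is only a sketch; x, cutoff and is_outlier are illustrative names that do not appear in the question.

```r
x <- c(1, 2, 3, 4, 50)                      # hypothetical data; 50 is the only outlier
cutoff <- quantile(x, 0.75) + 1.5 * IQR(x)  # upper fence, as in the question
is_outlier <- x > cutoff                    # logical vector, TRUE only for 50

# -TRUE is -1 and -FALSE is 0, so this drops the FIRST element, not the outlier.
wrong <- x[-is_outlier]

# Logical negation keeps exactly the elements that are not flagged.
right <- x[!is_outlier]

# Same result here, but -which(...) returns an empty vector when nothing is flagged.
also_right <- x[-which(is_outlier)]

print(wrong)
print(right)
```

With a unary minus the result is still one element shorter, which can look like success, but the value that went missing is x[1] rather than the outlier.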

Upvotes: 1
