StudentOfScience

Reputation: 809

Removing matrix rows if values of a column are outliers

There is a really cool and easy function by @aL3xa here, but it works on a vector.

I have a matrix, and column 2, say, holds a variable whose outliers I want to remove along with their associated rows. I would also like to use the algorithms from the outliers package, but they seem to work on vectors too. Any suggestions?

Thanks

Upvotes: 0

Views: 1124

Answers (1)

Jota

Reputation: 17621

Taking some of the code from the question you linked:

# @aL3xa's function
remove_outliers <- function(x, na.rm = TRUE, ...) {
  qnt <- quantile(x, probs=c(.25, .75), na.rm = na.rm, ...)
  H <- 1.5 * IQR(x, na.rm = na.rm)
  y <- x
  y[x < (qnt[1] - H)] <- NA
  y[x > (qnt[2] + H)] <- NA
  y
}

set.seed(1)
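x <- as.data.frame(matrix(rnorm(10000), ncol = 100))  # 100 x 100 data frame
y <- remove_outliers(x[, 2])  # outliers in column 2 become NA

newx <- cbind(x, y)  # append the flagged values as column `y`

# keep only the rows where column 2 was not flagged as an outlier
newx2 <- newx[!is.na(newx$y), ]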
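
Since the question is about a matrix rather than a data frame, here is a minimal sketch of the same IQR rule applied directly to a matrix with a logical row index. The helper name `drop_outlier_rows` and its arguments `col` and `k` are hypothetical, chosen only for illustration; the fences are the same 1.5 * IQR bounds used in @aL3xa's function.

```r
# Sketch: drop rows of a matrix whose value in one column is an IQR outlier.
# `drop_outlier_rows`, `col` and `k` are hypothetical names for illustration.
drop_outlier_rows <- function(m, col = 2, k = 1.5) {
  v <- m[, col]
  qnt <- quantile(v, probs = c(0.25, 0.75), na.rm = TRUE)
  h <- k * IQR(v, na.rm = TRUE)
  # TRUE for rows inside the fences; rows with NA in `col` are dropped too
  keep <- !is.na(v) & v >= (qnt[1] - h) & v <= (qnt[2] + h)
  m[keep, , drop = FALSE]  # drop = FALSE keeps the result a matrix
}

set.seed(1)
m <- matrix(rnorm(10000), ncol = 100)  # 100 x 100 matrix
m[1, 2] <- 50                          # plant an obvious outlier in column 2
m_clean <- drop_outlier_rows(m, col = 2)
```

The same indexing pattern should work with any vector-based detector, as long as its result can be turned into a logical vector with one entry per row.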

Upvotes: 2
