TheAnalyst

Reputation: 33

I'm having problems returning a data frame from an R function that removes outliers

I am trying to write a function that returns a data frame with the outliers in a specific column removed, but the data frame it returns is always empty, no matter which column I use.

remove_outlier = function(dataframe,column){
  average = mean(dataframe[[column]])
  std = sd(dataframe[[column]])
  cutoff = 3 * std
  lower = average - cutoff
  upper = average + cutoff
  print(lower)
  new = dataframe[dataframe[[column]] > lower & dataframe[[column]] < lower]

  return(new)

}

testing = remove_outlier(BostonHousing,'age')

head(testing)

Upvotes: 1

Views: 72

Answers (1)

Reeza

Reputation: 21294

 new = dataframe[dataframe[[column]] > lower & dataframe[[column]] < lower]

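This line is incorrect. Both comparisons use lower, and no value can be strictly greater than lower and strictly less than lower at the same time, so the condition is FALSE for every row and nothing is selected. I suspect you intended upper in the second comparison. The index also needs a trailing comma so that the logical vector selects rows rather than columns: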

new = dataframe[dataframe[[column]] > lower & dataframe[[column]] < upper,]

EDIT: add a comma, thanks to u/maydin for the catch.
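
For reference, here is a minimal sketch of the whole function with that fix applied. The `k` argument, the `na.rm = TRUE` handling, and the `toy` data frame are my own additions for illustration and are not part of the original question, so adjust them to your data.

```r
# Sketch: keep only the rows whose value in `column` lies strictly within
# k standard deviations of the column mean (k = 3 matches the question).
remove_outlier <- function(dataframe, column, k = 3) {
  values  <- dataframe[[column]]
  average <- mean(values, na.rm = TRUE)        # assumption: ignore NAs
  cutoff  <- k * sd(values, na.rm = TRUE)
  lower   <- average - cutoff
  upper   <- average + cutoff

  # Rows with NA in `column` are dropped (assumption).
  keep <- !is.na(values) & values > lower & values < upper

  # The comma after `keep` makes this a row selection;
  # drop = FALSE keeps the result a data frame.
  dataframe[keep, , drop = FALSE]
}

# Made-up data: twenty ordinary values and one extreme value.
toy <- data.frame(id = 1:21, age = c(rep(50, 20), 500))
cleaned <- remove_outlier(toy, "age")
nrow(cleaned)
```

With the toy data above, the row with age 500 falls outside the upper bound and is removed, while the other 20 rows are kept. The same call pattern as in the question, remove_outlier(BostonHousing, 'age'), should work unchanged.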

Upvotes: 3
