Outlier Handling in data mining

Question

I have one outlier in the Body Mass Index column that is very far from the rest of the data. The second-highest value is 38.1, whereas the outlier is 294. The true value is 29.4; the error occurred while collecting the data. I don't want to delete the row because I have a limited amount of data. What is the best technical approach to this problem? Would it be reasonable to treat the value as missing and apply a method such as Expectation Maximization imputation or Bayesian multiple imputation? Please help me solve this issue. Thanks.

Has QUIT--Anony-Mousse · Accepted Answer

Detect the bad data and, if necessary, replace it with any data imputation technique you like.
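As a rough illustration of the detect-then-impute route, here is a minimal sketch using only the Python standard library. The function names, the IQR fence multiplier `k=3.0`, and every BMI value other than 38.1 and 294 are assumptions made up for the example. Median imputation stands in for whichever imputation method you prefer (EM, multiple imputation, and so on).

```python
import statistics

def flag_outliers_iqr(values, k=3.0):
    """Mark values outside [Q1 - k*IQR, Q3 + k*IQR] (k is an assumed choice)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]

def impute_flagged_with_median(values, flags):
    """Replace flagged values with the median of the unflagged ones."""
    clean = [v for v, bad in zip(values, flags) if not bad]
    fill = statistics.median(clean)
    return [fill if bad else v for v, bad in zip(values, flags)]

# Hypothetical BMI column: only 38.1 and 294 come from the question.
bmi = [22.4, 24.9, 27.3, 31.0, 38.1, 294.0, 26.1, 23.8, 29.9, 25.5]

flags = flag_outliers_iqr(bmi)
bmi_imputed = impute_flagged_with_median(bmi, flags)

for original, bad, fixed in zip(bmi, flags, bmi_imputed):
    if bad:
        print(f"flagged {original}, replaced with {fixed}")
```

With a wide fence like this, the legitimate high value 38.1 is kept and only the extreme entry is treated as missing. The row itself stays in the data set.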

Of course, it is better if you can just leave the bad data in and design your overall approach to be robust enough to handle it.
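To show what "robust enough" can mean in practice, the sketch below compares the mean and standard deviation with the median and the median absolute deviation (MAD) on the same column, with the bad value left in. The `mad` helper and all BMI values other than 38.1 and 294 are assumptions for illustration; median and MAD are just one example of robust estimators, not necessarily the ones your method should use.

```python
import statistics

def mad(values):
    """Median absolute deviation: median of |x - median(x)|."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])

# Hypothetical BMI column: only 38.1 and 294 come from the question.
bmi = [22.4, 24.9, 27.3, 31.0, 38.1, 294.0, 26.1, 23.8, 29.9, 25.5]
bmi_without = [v for v in bmi if v != 294.0]

# Non-robust summaries: a single bad value moves them a lot.
mean_with = statistics.mean(bmi)
mean_without = statistics.mean(bmi_without)
stdev_with = statistics.stdev(bmi)

# Robust summaries: the bad value barely changes them.
median_with = statistics.median(bmi)
median_without = statistics.median(bmi_without)
mad_with = mad(bmi)

print("mean shift:  ", round(mean_with - mean_without, 2))
print("median shift:", round(median_with - median_without, 2))
print("stdev vs MAD:", round(stdev_with, 2), round(mad_with, 2))
```

The same idea carries over to modelling: methods built on medians, ranks, or trimmed estimates tend to tolerate one wild value without the row having to be edited or dropped.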
