Skye

Reputation: 57

LIME ML Interpreter mode Classification or Regression for Isolation Forest (Anomaly Detection)

I am trying to find anomalies in my dataset of 1000+ documents. I'm using the LIME ML interpreter to explain the predictions of the model (Isolation Forest). Its "mode" parameter lets me choose between classification and regression. I do not have a set of documents with known anomalies. Since Isolation Forest is an unsupervised learning method, and classification is a type of supervised learning used to classify observations into two or more classes, I ended up using regression. On the other hand, the outcome I get is binary: anomaly or no anomaly.

What is right to use here?

Best Regards, Elle

Upvotes: 0

Views: 1055

Answers (3)

FlyingPickle

Reputation: 1133

Another option I see is to hold out 10-20% of the data set while building the Isolation Forest trees. Score the model on this holdout to get the anomaly score (or average tree depth) and build the explainer on that. Then, when scoring new data, LIME will treat it as a regression problem. I am not sure how well this will work, though.

Upvotes: 0

Jon Nordby

Reputation: 6299

Not directly about LIME, but Shapley values can be used to create similar explanations for IsolationForest. See this answer.

Upvotes: 0

Reshma Godse

Reputation: 11

For us, what we have done is as follows:

  1. Use Isolation Forest to get anomalies.
  2. Treat 1 and -1 returned by Isolation Forest as class labels and build a Random Forest classifier.
  3. Pass this Random Forest classifier to LIME to get explanations of the anomalous points.

We are also looking for a better option than building a second-level Random Forest classifier.

Upvotes: 1
