How to interpret the reconstruction MSE from H2O anomaly detection?

Question

I am using H2O for anomaly detection. The data contains several continuous and categorical features, and the label is either 0 or 1. Because fewer than 1% of the rows have label 1, I am trying anomaly detection instead of the usual classification methods. However, in the end I get an MSE calculated per row of the data, and I am not sure how to interpret it so that I can say a row whose actual label is 0 is in fact an anomaly and should be 1.

The code I am using so far:

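library(h2o)
h2o.init()

# Use every column except the label as an input feature
features <- names(train.df)[!names(train.df) %in% c("label")]
# Train only on the normal class (label == 0)
train.df <- subset(train.df, label==0)
train.h <- as.h2o(train.df)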

mod.dl <- h2o.deeplearning(
  x=features,
  autoencoder=TRUE,
  training_frame=train.h,
  activation=c("Tanh"),
  hidden=c(10,10), epochs=20, adaptive_rate=FALSE,
  variable_importances=TRUE,
  l1=1e-4, l2=1e-4,
  sparse=TRUE
)

# Per-row reconstruction error of the autoencoder on the training frame
pred.oc <- as.data.frame(h2o.anomaly(mod.dl, train.h))

head(pred.oc):

  Reconstruction.MSE
1        0.012059304
2        0.014490905
3        0.011002231
4        0.013142910
5        0.009631915
6        0.012897779


Answers (1)
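
One possible reading, offered as a sketch rather than a definitive recipe: the reconstruction MSE is a per-row error between the input and the autoencoder's output, so it behaves like a distance score and not like a probability. Because the model in the question was trained only on label 0 rows, a row with an unusually high error is one the model reconstructs poorly, which makes it a candidate for label 1. Turning the score into a 0/1 decision needs a cutoff, and the cutoff is a choice you make. The example below assumes the 99th percentile of the training errors as that cutoff (an assumption, loosely motivated by the "less than 1%" figure in the question) and uses a hypothetical helper `flag_anomaly`. It runs in base R on the six values shown in the question, so it does not need an H2O cluster.

```r
# Reconstruction MSE values copied from the head(pred.oc) output in the question
mse <- c(0.012059304, 0.014490905, 0.011002231,
         0.013142910, 0.009631915, 0.012897779)

# Assumed cutoff: the 99th percentile of the errors seen on label == 0 rows.
# This is a tuning choice, not something H2O prescribes.
threshold <- quantile(mse, probs = 0.99, names = FALSE)

# Hypothetical helper: 1 = error above the cutoff (possible anomaly), 0 = otherwise
flag_anomaly <- function(err, cutoff) as.integer(err > cutoff)

# Flags for the rows the cutoff was derived from
flags <- flag_anomaly(mse, threshold)

# Flags for made-up errors of two new rows, for illustration only
new_flags <- flag_anomaly(c(0.005, 0.05), threshold)
```

In practice the same cutoff would be applied to the errors that `h2o.anomaly` returns for a held-out frame that still contains both labels, and the resulting flags compared against the true labels to check whether the cutoff is too loose or too strict.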
