assafmo

Reputation: 1086

Using ELKI MiniGUI for anomaly detection with training set and test set

I have:

  1. A file training.arff which contains only samples with normal behavior.

  2. A file test.arff which contains samples both with normal and abnormal behavior.

I would like to use ELKI MiniGUI for anomaly detection using semi-supervised learning.

I believe the usual approach is to build/train a model using training.arff and then apply the model to test.arff.

It does not matter which algorithm I use.

I just can't find where to specify those two files in ELKI MiniGUI to get my desired result. (There is only dbc.in.)

PS: After a week of trying with Weka I gave up, but I am not limited to ELKI.

Thanks!!

Upvotes: 1

Views: 394

Answers (1)

Erich Schubert

Reputation: 8715

Your scenario is a supervised learning approach.

ELKI currently includes only unsupervised outlier detection methods, which do not make use of the prior information that the training data is "normal only".

You could concatenate training and test files into one file, and then run outlier detection. Most published algorithms in this domain are unsupervised. In unsupervised learning, there is no training data set - there is only one kind of data.
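The concatenation step could be sketched as below. This is a minimal Python sketch, not part of ELKI: it assumes both files are plain (dense) ARFF files that declare the same attributes in the same order, and the function names `split_arff` and `concat_arff` are made up for this illustration.

```python
def split_arff(text):
    """Split ARFF text into (header_lines, data_rows) at the @data marker."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().lower() == "@data":
            header = lines[: i + 1]
            # Keep only real data rows: skip blank lines and % comments.
            rows = [
                row for row in lines[i + 1:]
                if row.strip() and not row.lstrip().startswith("%")
            ]
            return header, rows
    raise ValueError("no @data section found")


def attribute_lines(header):
    """Return the @attribute declarations of a header, stripped of whitespace."""
    return [
        line.strip() for line in header
        if line.strip().lower().startswith("@attribute")
    ]


def concat_arff(train_text, test_text):
    """Return one ARFF document: training header, training rows, then test rows."""
    train_header, train_rows = split_arff(train_text)
    test_header, test_rows = split_arff(test_text)
    # Sanity check (assumption of this sketch): both files must declare
    # identical attributes, otherwise the rows cannot simply be stacked.
    if attribute_lines(train_header) != attribute_lines(test_header):
        raise ValueError("attribute declarations differ between the two files")
    return "\n".join(train_header + train_rows + test_rows) + "\n"


# Hypothetical usage with the file names from the question:
#   with open("training.arff") as f, open("test.arff") as g:
#       combined = concat_arff(f.read(), g.read())
#   with open("combined.arff", "w") as out:
#       out.write(combined)
```

Because the row order is preserved, every row after the last training row comes from test.arff, which makes it possible to map outlier scores back to the test samples afterwards.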

Note that most algorithms available in ELKI as of 2014 are designed for numerical data. If your data is categorical, you will be able to use many of them, but you will need to implement data types and distance functions that fit your data type. There are some parsers and distances for non-numerical data available (e.g. for textual data), but these are not supported by the ARFF parser, and there is currently no distance function for mixed data either.
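To give an idea of what a distance for mixed data might compute, here is a minimal Python sketch of a Gower-style distance. It is not ELKI code and does not use the ELKI API; the function name `mixed_distance` and the `numeric_ranges` parameter are assumptions made up for this illustration.

```python
def mixed_distance(a, b, numeric_ranges):
    """Gower-style distance between two records with mixed attribute types.

    numeric_ranges maps the index of each numeric attribute to its value
    range (max - min over the data set). Every other index is treated as
    categorical and compared by simple matching.
    """
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    if not a:
        raise ValueError("records must not be empty")
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_ranges:
            value_range = numeric_ranges[i]
            # Numeric attribute: absolute difference scaled to [0, 1].
            total += abs(x - y) / value_range if value_range > 0 else 0.0
        else:
            # Categorical attribute: 0 if the values match, 1 otherwise.
            total += 0.0 if x == y else 1.0
    # Average the per-attribute dissimilarities.
    return total / len(a)
```

For example, `mixed_distance([1.0, "tcp"], [3.0, "udp"], {0: 4.0})` averages a numeric dissimilarity of 0.5 and a categorical mismatch of 1. In ELKI, the same idea would have to be written in Java as a custom distance function together with a matching data type.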

Upvotes: 3
