Reputation: 35726
I'm working on novelty detection using machine learning. I have tried the one-class SVM in scikit-learn.
from sklearn import svm
train_data = [[0, 0, 0, 0, 0, 1, 0, 0], [0, 1, 0, 0, 0, 1, 0, 0], [0, 1, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 1], [0, 3, 0, 0, 0, 1, 0, 0], [0, 11, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 4]]
test_data = [[0, 0, 0, 0, 0, 1, 0, 0], [0, 1, 0, 0, 0, 1, 0, 0]]
clf = svm.OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1)
clf.fit(train_data)
pred_test = clf.predict(test_data)
I'm new to this area. How can I tell whether there is novelty in my test data?
Upvotes: 0
Views: 5426
Reputation: 182
check = clf.predict(test_data)
predict returns one label per test sample. An entry of check equal to 1 means that sample is not an anomaly;
an entry equal to -1 means that sample is an anomaly, i.e. an outlier.
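A minimal sketch of this check, reusing the data and parameters from the question. The names `is_novel` and `novel_idx` are only illustrative. Since `predict` returns an array, the comparison is done element-wise rather than with a single `if`.

```python
import numpy as np
from sklearn import svm

train_data = [[0, 0, 0, 0, 0, 1, 0, 0], [0, 1, 0, 0, 0, 1, 0, 0],
              [0, 1, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 1],
              [0, 3, 0, 0, 0, 1, 0, 0], [0, 11, 0, 0, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 4]]
test_data = [[0, 0, 0, 0, 0, 1, 0, 0], [0, 1, 0, 0, 0, 1, 0, 0]]

clf = svm.OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1)
clf.fit(train_data)

# One label per test row: +1 for inliers, -1 for outliers (novelties).
pred_test = clf.predict(test_data)

# Boolean mask and row indices of the samples flagged as novel.
is_novel = pred_test == -1
novel_idx = np.flatnonzero(is_novel)

print("labels:", pred_test)
print("novel rows:", novel_idx)
print("any novelty in test data:", bool(is_novel.any()))
```

If `novel_idx` is empty, none of the test rows was flagged as novel under these parameters.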
Upvotes: 1
Reputation: 116
The inliers are labeled 1, and the outliers (i.e., the novelties in your case) are labeled -1 (as the result of the predict function).
Please note that the current documentation incorrectly states that outliers are labeled 1 and inliers are labeled 0. Check the latest updates in the GitHub repo for the correct information.
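One way to check the labeling convention yourself, rather than relying on the documentation, is to compare the labels against `decision_function`, which returns a signed score per sample (positive for inliers, negative for outliers). This is only a sketch on the question's data; the `scores` name is illustrative.

```python
import numpy as np
from sklearn import svm

train_data = [[0, 0, 0, 0, 0, 1, 0, 0], [0, 1, 0, 0, 0, 1, 0, 0],
              [0, 1, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 1],
              [0, 3, 0, 0, 0, 1, 0, 0], [0, 11, 0, 0, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 4]]

clf = svm.OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1)
clf.fit(train_data)

# Score the training rows: hard labels and signed distances to the boundary.
labels = clf.predict(train_data)
scores = np.ravel(clf.decision_function(train_data))

# Print them side by side; clearly positive scores should carry label 1
# and clearly negative scores label -1.
for row, label, score in zip(train_data, labels, scores):
    print(row, label, round(float(score), 4))
```

Only the values 1 and -1 should appear in the label column; a 0 label would indicate the convention described in the older documentation.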
Upvotes: 2