Reputation: 21883
I classified some data that is split into 2 categories with the LinearDiscriminantAnalysis classifier from sklearn, and it works well. This is what I did:
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25)  # 25% of the dataset is held out and not used for training
clf = LDA()
clf.fit(x_train, y_train)
Then I manage to make predictions with it, and that part works fine.
But all of that is in an IPython notebook, and I'd like to use the classifier elsewhere. I've seen the possibility of using pickle and joblib, but as I only have 2 groups and 2 features, I thought that I could just get the equation of the boundary line, and then check whether a given point is above or below the line to tell which group it belongs to.
From what I understood, this line is orthogonal to the projection line and goes through the mean of the clusters' means. I think I got the mean of the clusters' means with np.mean(clf.means_, axis=0).
But here I'm stuck on how to use all the attributes like clf.coef_, clf.intercept_, etc. to find the equation of the projection line.
So, my question is: how can I get the boundary line equation given my classifier?
It is also possible that I did not understand LDA properly, and I'd be delighted to have more explanations.
Thanks
Upvotes: 3
Views: 4170
Reputation: 66815
The decision boundary is simply the line given by
np.dot(clf.coef_, x) + clf.intercept_ = 0
as this is where the sign of the decision function flips. In scikit-learn the decision function is computed as coef_ · x + intercept_, so the intercept enters with a plus sign; in other implementations the sign of the intercept may be flipped.
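To make this concrete for the 2-group, 2-feature case, here is a minimal sketch of turning that equation into a slope/offset form and classifying points without calling the classifier. The synthetic data, and the names `slope`, `offset` and `predict_from_line`, are assumptions made for illustration and are not part of the question; the sketch also assumes the second coefficient is non-zero so the line can be written as x2 = slope * x1 + offset.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic two-class, two-feature data (a hypothetical stand-in for x, y).
rng = np.random.default_rng(0)
x = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=1.0, size=(100, 2)),
    rng.normal(loc=[3.0, 3.0], scale=1.0, size=(100, 2)),
])
y = np.array([0] * 100 + [1] * 100)

clf = LinearDiscriminantAnalysis().fit(x, y)

# For a binary problem, coef_ has shape (1, 2) and intercept_ has shape (1,).
w1, w2 = clf.coef_[0]
b = clf.intercept_[0]
neg_label, pos_label = clf.classes_

# Boundary: w1*x1 + w2*x2 + b = 0, rewritten as x2 = slope*x1 + offset.
# This rewriting assumes w2 != 0 (i.e. the boundary is not vertical).
slope = -w1 / w2
offset = -b / w2

def predict_from_line(points):
    """Classify points from w1, w2, b alone: a positive score means classes_[1]."""
    points = np.asarray(points, dtype=float)
    score = w1 * points[:, 0] + w2 * points[:, 1] + b
    return np.where(score > 0, pos_label, neg_label)

manual = predict_from_line(x)
agreement = np.mean(manual == clf.predict(x))

# Points taken on the line should have a decision function value of about zero.
x1_grid = np.linspace(-2.0, 5.0, 5)
on_line = np.column_stack([x1_grid, slope * x1_grid + offset])
boundary_scores = clf.decision_function(on_line)
```

Once `w1`, `w2` and `b` are known, they can be copied elsewhere as three plain numbers, which avoids pickling the classifier. Note that the boundary passes through the midpoint of the class means, np.mean(clf.means_, axis=0), only when the class priors are equal; with unequal priors it is shifted along the projection direction.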
Upvotes: 2