mfalcon

Reputation: 880

Prediction confidence using scikit LinearSVC

I'm using LinearSVC to classify text, but I'd like to get a prediction confidence level attached to every prediction.

This is what I have right now:

    from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
    from sklearn.svm import LinearSVC

    # train_set is a sequence of (text, label) pairs
    train_set = self.read_training_files()
    count_vect = CountVectorizer()
    X_train_counts = count_vect.fit_transform([e[0] for e in train_set])
    tfidf_transformer = TfidfTransformer()
    X_train_tfidf = tfidf_transformer.fit_transform(X_train_counts)
    clf = LinearSVC(C=1).fit(X_train_tfidf, [e[1] for e in train_set])
    foods = list(self.get_foods())
    lenfoods = len(foods)
    i = 0
    for food in foods:
        fd = self.get_modified_food(food)
        food_desc = fd['fields']['title'].replace(',', '').lower()
        # Reuse the fitted vectorizer and transformer on the new text
        X_new_counts = count_vect.transform([food_desc])
        X_new_tfidf = tfidf_transformer.transform(X_new_counts)
        predicted = clf.predict(X_new_tfidf)

The variable "predicted" will contain the predicted category number with no confidence level included. I have been reading the source code here but I didn't find a proper attribute to do this.

Upvotes: 2

Views: 2472

Answers (1)

Eiyrioü von Kauyf

Reputation: 4725

I think you're looking in the wrong place :). Have you looked at the relevant decision function, LinearSVC's decision_function method?


Personally, I find the sklearn docs very helpful, sometimes more so than the code :)
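
For what it's worth, here is a minimal sketch of how decision_function could be used to get a per-prediction score. The toy texts and labels are made up for illustration, and TfidfVectorizer stands in for the CountVectorizer + TfidfTransformer pair from the question. Note that these scores are signed distances to each class's hyperplane, not probabilities, so treating the gap between the top two scores as a "confidence" is only a rough heuristic.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Toy training data, invented for this example (three categories).
train_texts = [
    "fresh apple", "ripe banana",       # category 0
    "grilled steak", "roast chicken",   # category 1
    "whole milk", "cheddar cheese",     # category 2
]
train_labels = [0, 0, 1, 1, 2, 2]

# TfidfVectorizer combines CountVectorizer and TfidfTransformer in one step.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)
clf = LinearSVC(C=1).fit(X_train, train_labels)

X_new = vectorizer.transform(["grilled chicken"])

# One score per class for each sample; with 3 classes the shape is (1, 3).
scores = clf.decision_function(X_new)
predicted = clf.predict(X_new)

# Heuristic confidence: gap between the best and second-best class score.
sorted_scores = np.sort(scores, axis=1)
margin = sorted_scores[:, -1] - sorted_scores[:, -2]

print("predicted category:", predicted[0])
print("per-class scores:", scores[0])
print("margin over runner-up:", margin[0])
```

With only two classes, decision_function returns a 1-D array of shape (n_samples,) instead, and the absolute value of the score plays the role of the margin. If you need actual probabilities, wrapping the classifier in sklearn.calibration.CalibratedClassifierCV is one option worth looking at.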

Upvotes: 5
