Reputation: 360
I don't know where to start with this problem, because I am only now learning about neural networks. I have a large database of sentence < label pairs. For example:
i want take a photo < photo
i go to take a photo < photo
i go to use my camera < photo
i go to eat something < eat
i like my food < eat
When the user writes a new sentence, I want to check the accuracy score of every label:
"I go to bed, after i use my camera" < photo: 0.9000 , eat: 0.4000, ...
So my question is: where can I start? TensorFlow and scikit-learn both look good, but their document classification examples don't show the accuracy :\
Upvotes: 2
Views: 68
Reputation: 17015
from sklearn.linear_model import LogisticRegression
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn import metrics
sentences = ["i want take a photo", "i go to take a photo", "i go to use my camera", "i go to eat something", "i like my food"]
labels = ["photo", "photo", "photo", "eat", "eat"]
# Fit TF-IDF on the sentences and turn them into feature vectors
tfv = TfidfVectorizer()
X = tfv.fit_transform(sentences)
# Encode the string labels as integers
lbl = LabelEncoder()
y = lbl.fit_transform(labels)
# Hold out part of the data for evaluation, keeping the class balance
xtrain, xtest, ytrain, ytest = train_test_split(X, y, stratify=y, random_state=42)
clf = LogisticRegression()
clf.fit(xtrain, ytrain)
predictions = clf.predict(xtest)
print("Accuracy Score = ", metrics.accuracy_score(ytest, predictions))
For new data:
new_sentence = ["this is a new sentence"]
X_Test = tfv.transform(new_sentence)
# One probability per label; columns follow the order of lbl.classes_
print(clf.predict_proba(X_Test))
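To get output shaped like the "photo: 0.9000, eat: 0.4000" line in the question, the probabilities can be paired with their label names. The following is only a minimal sketch of one way to do that: it assumes the same five example sentences, swaps in a scikit-learn pipeline so that raw strings can be passed directly, and trains on all of the data with no held-out split. The helper name label_scores is made up for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

sentences = [
    "i want take a photo",
    "i go to take a photo",
    "i go to use my camera",
    "i go to eat something",
    "i like my food",
]
labels = ["photo", "photo", "photo", "eat", "eat"]

# Chain the vectorizer and the classifier so raw strings go in directly.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(sentences, labels)


def label_scores(sentence):
    """Return {label: probability} for one sentence (hypothetical helper)."""
    probabilities = model.predict_proba([sentence])[0]
    # model.classes_ lists the labels in the same order as the columns.
    return dict(zip(model.classes_, probabilities))


scores = label_scores("I go to bed, after i use my camera")
# Print the labels from most to least probable.
for label, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{label}: {score:.4f}")
```

Note that the scores returned by predict_proba here sum to 1 across the labels, so they are relative probabilities rather than independent per-label scores like the 0.9 and 0.4 in the question.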
Upvotes: 1