Reputation: 932
I have run a classification across an image and written the resulting per-pixel classes to a dataset. I also have a second dataset of the same type, the training data, which has a different number of samples. I want to run an accuracy assessment against the classified pixel data using this training dataset, which the user creates themselves. I have tried sklearn's confusion_matrix and accuracy_score, but my issue is that the two datasets (producer, user) are different sizes. Is there an accuracy assessment I can perform to check my results?
Here are the two datasets, including their sizes:
Code:
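import pandas as pd
from sklearn.metrics import accuracy_score, confusion_matrix

# Training data created by the user
user = pd.read_csv("/Users/chrisradford/Documents/School/Masters/RA/Classifier/Python/Training.csv")
# Classified pixel data written out by the classifier
producer = pd.read_csv("/Users/chrisradford/Documents/School/Masters/RA/Classifier/Python/ProducerData.csv")
print("User created training data")
print(user.shape)
print(user.head())
print("producer created data")
print(producer.shape)
print(producer.head())
# Both metrics need inputs with the same number of samples,
# so these calls fail when user and producer differ in size.
val = accuracy_score(user, producer)
cnf_matrix = confusion_matrix(producer, user)
print(val)
print(cnf_matrix)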
Upvotes: 0
Views: 634
Reputation: 620
As far as I know, the best way to evaluate image classification accuracy is K-fold cross validation. You can choose any value of K, but I prefer K = 10, which keeps the estimate from depending on a single, possibly biased, split of the data. For each of the K folds (in a two-class problem) you get four values: false positives, false negatives, true positives and true negatives. You can then build the overall confusion matrix by averaging each of these values across the folds.
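A minimal sketch of the averaging described above, not a definitive implementation. It assumes scikit-learn is available and uses a synthetic two-class dataset from make_classification as a stand-in for labelled pixel samples; the RandomForestClassifier and the helper name mean_confusion_matrix are illustrative choices, not anything from the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import KFold


def mean_confusion_matrix(X, y, n_splits=10, seed=0):
    """Average the per-fold confusion matrices from K-fold cross validation."""
    labels = np.unique(y)
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    total = np.zeros((len(labels), len(labels)), dtype=float)
    for train_idx, test_idx in kfold.split(X):
        # Hypothetical classifier choice; substitute your own model here.
        model = RandomForestClassifier(n_estimators=50, random_state=seed)
        model.fit(X[train_idx], y[train_idx])
        predicted = model.predict(X[test_idx])
        # labels= keeps the matrix shape fixed even if a fold lacks a class.
        total += confusion_matrix(y[test_idx], predicted, labels=labels)
    return total / n_splits


# Synthetic stand-in data (assumption): 200 samples, 6 features, 2 classes.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=4, random_state=0
)
mean_cm = mean_confusion_matrix(X, y)
overall_accuracy = np.trace(mean_cm) / mean_cm.sum()
print(mean_cm)
print("overall accuracy:", overall_accuracy)
```

Rows of the averaged matrix are true classes and columns are predicted classes, following scikit-learn's convention. Note that this evaluates a model on held-out parts of one labelled dataset; it does not by itself align two datasets of different sizes.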
Upvotes: 0