Reputation: 1589
I'm following this blog post:
https://yashk2810.github.io/Applying-Convolutional-Neural-Network-on-the-MNIST-dataset/
and using the model there to work with my data.
The Keras model used is:
from keras.models import Sequential
from keras.layers import (Activation, BatchNormalization, Conv2D, Dense,
                          Dropout, Flatten, MaxPooling2D)

model = Sequential()
# Convolutional block 1
model.add(Conv2D(32, (3, 3), input_shape=(28, 28, 1)))
model.add(BatchNormalization(axis=-1))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(BatchNormalization(axis=-1))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Convolutional block 2
model.add(Conv2D(64, (3, 3)))
model.add(BatchNormalization(axis=-1))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(BatchNormalization(axis=-1))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
# Fully connected layer
model.add(Dense(512))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(10))
model.add(Activation('softmax'))
Say the above is Model A, which has Batch Normalization, and Model B is a replica of Model A but without Batch Normalization.
I have 15 test files for which I'm calculating precision, recall, accuracy, etc.
Currently I'm training Model A and noting down the results that appear on the terminal, then training Model B, noting those results, and comparing.
Is there a way to automatically save the results from both models in a tabular format so that I can easily compare how the various metrics differ between the two cases?
It is not very important, but the structure I have in mind is something like an R data frame:
Filename | Model | Metric1 | Metric2
a        | A     | 90%     | 80%
b        | A     | 60%     | 90%
a        | B     | 70%     | 81%
Thank you.
PS: Just to be clear, I know that I can save the results from each run to a list/dict and then reshape it as I want.
My question is: if I declare 3-4 models, how do I compare their performance automatically?
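For context, here is a minimal sketch of the list/dict approach I mean, collecting results into a pandas DataFrame. The names models and test_files are illustrative placeholders (a dict of compiled Keras models and a dict of test arrays), and I'm assuming each model was compiled with metrics=['accuracy'] so that evaluate returns loss and accuracy:

import pandas as pd

# models: {"A": model_a, "B": model_b, ...} -- already compiled and trained
# test_files: {"a": (x_test, y_test), "b": (x_test, y_test), ...}
rows = []
for model_name, model in models.items():
    for filename, (x_test, y_test) in test_files.items():
        loss, acc = model.evaluate(x_test, y_test, verbose=0)
        rows.append({"Filename": filename, "Model": model_name,
                     "Loss": loss, "Accuracy": acc})

results = pd.DataFrame(rows)
# One column per model makes side-by-side comparison easy:
print(results.pivot(index="Filename", columns="Model", values="Accuracy"))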
Upvotes: 1
Views: 645
Reputation: 1593
I actually don't know how to compare the models automatically, but ROC curves are commonly used to compare classifiers. I suggest you read Fawcett's paper on ROC analysis; you can request it on ResearchGate.
With binary classifiers, one calculates the true positive rate and the false positive rate for all possible thresholds and plots the former on the y-axis and the latter on the x-axis. The resulting curve for each classifier can be integrated, and the resulting integral, the so-called "area under the curve" (AUC), is equal to the probability that the classifier ranks a randomly chosen positive sample higher than a randomly chosen negative one. This value can be used to compare classifiers, because a higher value indicates better overall performance. Fawcett also gives a method for applying this to multi-class classification.
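To make this concrete, here is a minimal sketch using scikit-learn (not from the original post). y_true, scores_a and scores_b are illustrative binary labels and predicted positive-class scores for two models:

import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
scores_a = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.3])
scores_b = np.array([0.3, 0.2, 0.6, 0.6, 0.9, 0.1, 0.5, 0.4])

for name, scores in [("A", scores_a), ("B", scores_b)]:
    fpr, tpr, thresholds = roc_curve(y_true, scores)  # all operating points
    print(name, "AUC =", roc_auc_score(y_true, scores))

Plotting tpr against fpr (e.g. with matplotlib) gives the ROC curve itself; the printed AUC values are the single numbers you would compare across models.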
Upvotes: 1