Thilo

Reputation: 25

Test accuracy vs Training time on Weka

From what I know, test accuracy should increase when training time increases (up to some point), but experimenting with Weka yielded the opposite. I am wondering if I misunderstood something. I used diabetes.arff for classification, with 70% of the data for training and 30% for testing. I used the MultilayerPerceptron classifier and tried training times of 100, 500, 1000, 3000, 5000, and 10000. Here are my results:

   Training time   Accuracy  
   100             75.2174 %
   500             75.2174 %
   1000            74.7826 %
   3000            72.6087 %
   5000            70.4348 %
   10000           68.6957 % 

What can be the reason for this? Thank you!

Upvotes: 1

Views: 1410

Answers (1)

baddog

Reputation: 480

You have got a very nice example of overfitting. (In Weka's MultilayerPerceptron, the trainingTime parameter is the number of training epochs, so a larger value means more passes over the same training data.)

Here is the short explanation of what happened:

Your model (it doesn't matter whether it is a multilayer perceptron, a decision tree, or literally anything else) can fit the training data in two ways.

The first is generalization: the model tries to find patterns and trends and uses them to make predictions. The second is memorizing the exact data points from the training dataset.

Imagine a computer vision task: classify images into two categories, humans vs. trucks. A good model will find common features that are present in human pictures but not in truck pictures (smooth curves, skin-colored surfaces). This is generalization. Such a model will be able to handle new pictures pretty well. A bad, overfitted model will just memorize the exact images, the exact pixels, of the training dataset and will have no idea what to do with new images in the test set.

What can you do to prevent overfitting?

There are a few common approaches to dealing with overfitting:

  1. Use a simpler model. With fewer parameters, it is harder for the model to memorize the dataset.
  2. Use regularization. Constrain the weights of the model and/or use dropout in your perceptron.
  3. Stop the training process earlier. Split your training data once more, so you have three parts of the data: training, dev, and test. Then train your model using the training data only, and stop the training when the error on the dev set stops decreasing. (Weka's MultilayerPerceptron can do something like this for you via its validationSetSize option.)
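To make approach 3 concrete, here is a minimal, self-contained sketch of early stopping in plain Python (not Weka). The toy dataset, the polynomial model, and the DEGREE/LR/PATIENCE values are all invented for illustration; the point is only the loop structure: after each epoch, check the dev-set error, keep the best weights seen so far, and stop once the dev error has not improved for a while.

```python
import random

random.seed(0)

# Toy regression task: noisy quadratic data. A deliberately
# over-parameterized polynomial fit by gradient descent stands in
# for any model that can overfit with too much training.
def make_data(n):
    xs = [random.uniform(-1, 1) for _ in range(n)]
    ys = [x * x + random.gauss(0, 0.1) for x in xs]
    return xs, ys

train_x, train_y = make_data(30)
dev_x, dev_y = make_data(30)      # held-out dev set, used only for stopping

DEGREE = 9        # over-parameterized on purpose
LR = 0.05         # gradient-descent step size
PATIENCE = 200    # epochs to wait for a dev-set improvement

w = [0.0] * (DEGREE + 1)

def predict(w, x):
    return sum(wi * x ** i for i, wi in enumerate(w))

def mse(w, xs, ys):
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

best_dev, best_w, since_best = float("inf"), w[:], 0
for epoch in range(5000):
    # One gradient-descent step on the training set only.
    grads = [0.0] * len(w)
    for x, y in zip(train_x, train_y):
        err = predict(w, x) - y
        for i in range(len(w)):
            grads[i] += 2 * err * x ** i / len(train_x)
    w = [wi - LR * g for wi, g in zip(w, grads)]

    # Early-stopping check: track the best dev error seen so far.
    dev_err = mse(w, dev_x, dev_y)
    if dev_err < best_dev:
        best_dev, best_w, since_best = dev_err, w[:], 0
    else:
        since_best += 1
        if since_best >= PATIENCE:   # dev error stopped improving
            break

print(f"stopped after epoch {epoch}, best dev MSE {best_dev:.4f}")
```

The weights you keep at the end are `best_w`, the ones from the epoch with the lowest dev error, not the final weights. In your Weka experiment this corresponds to preferring the accuracy you got around trainingTime 100-500 rather than letting it run to 10000.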

A good starting point for reading about overfitting is Wikipedia: https://en.wikipedia.org/wiki/Overfitting

Upvotes: 1
