Frank B.

Reputation: 315

Difference between fitting a DNN with a validation set and without in TensorFlow

I'm building a pattern recognition Deep Neural Network using TensorFlow in Python 3.5. After building my net and creating my training set, I train my model using the following function in TensorFlow:

model = tflearn.DNN(net, tensorboard_dir='tflearn_logs')
model.fit(train_x, train_y, n_epoch=2000, batch_size=8, show_metric=True)
model.save(name + '.tflearn')

And it works quite well when I make predictions on input it has never seen. Reading the TFLearn documentation on the fit function, I see that I can pass it a "validation_set"; as the name suggests, it is a set used to validate my model.

What is the difference between passing a validation set and not passing it?

Upvotes: 2

Views: 1743

Answers (2)

Mehran

Reputation: 60

In machine learning, you split your data into three parts: (1) train, (2) validation, (3) test. You then try a number of different hyper-parameters (e.g., the number of epochs in your case, or the number of layers in the network, ...) by learning a model on the training data and measuring its performance on the validation data. Next you pick the model trained with the best hyper-parameters (according to performance on the validation set) and measure its performance on your test data, which gives you the accuracy of your model. If you skip the validation set and tune your hyper-parameters on your test set instead, that is considered cheating, because you may overfit to your test set (i.e., end up with a model that specializes in making good predictions for your test set).
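As a sketch of the three-way split described above (a hypothetical helper, not part of tflearn; the 70/15/15 proportions are just an illustrative choice):

```python
import random

def three_way_split(data, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle a dataset and split it into train/validation/test parts."""
    data = list(data)
    random.Random(seed).shuffle(data)
    n = len(data)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = data[:n_test]
    val = data[n_test:n_test + n_val]
    train = data[n_test + n_val:]
    return train, val, test

train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))  # 70 15 15
```

You tune hyper-parameters by comparing models on `val`, and only touch `test` once at the very end.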

Upvotes: 0

Lan

Reputation: 6640

Actually, IMHO, the "validation set" naming here is quite confusing. Usually, in machine learning or deep learning, validation refers to a dataset used for hyperparameter tuning, such as the number of layers of your DNN, the number of neurons in each layer, or the lambda value of regularization. So this argument would be more accurately named test_set.

But anyway, you have two ways to specify the validation set in tflearn. One is to pass it in as a float between 0 and 1:

model.fit(X, Y, validation_set=0.1)

That means the fit method will use 10% of the training data to evaluate the performance of your model and only use the remaining 90% of the original training dataset for training.
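To make the 90/10 proportions concrete, here is a simplified stand-in for that hold-out behavior in plain Python (I am not claiming this is exactly how tflearn partitions the data internally, only what a fractional split amounts to):

```python
def holdout_split(X, Y, val_frac=0.1):
    """Hold out the last val_frac of the samples for validation,
    keeping the rest for training (simplified illustration of
    validation_set=0.1)."""
    n_val = int(len(X) * val_frac)
    split = len(X) - n_val
    return (X[:split], Y[:split]), (X[split:], Y[split:])

X = list(range(50))
Y = [x % 2 for x in X]
(train_X, train_Y), (val_X, val_Y) = holdout_split(X, Y, val_frac=0.1)
print(len(train_X), len(val_X))  # 45 5
```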

Or you can split the dataset into a training dataset and a validation/test dataset yourself and pass them in as below:

model.fit(X_train, Y_train, validation_set=(X_test, Y_test))

With a validation set, you can state with confidence what the accuracy of your model is on "unseen" data, instead of relying on a statement like "it works quite well when I make predictions on input it never saw". Also, if you find that the accuracy of the model on the training data is much higher than on the validation dataset, you know you have an overfitting problem and can apply techniques to address it.
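That last check can be expressed as a tiny heuristic (the 0.1 gap threshold is an arbitrary assumption of mine, not a rule from tflearn or the answer above):

```python
def overfit_gap(train_acc, val_acc, threshold=0.1):
    """Flag a likely overfit when training accuracy exceeds
    validation accuracy by more than `threshold`."""
    return (train_acc - val_acc) > threshold

print(overfit_gap(0.99, 0.80))  # True: big train/validation gap
print(overfit_gap(0.92, 0.90))  # False: the two accuracies are close
```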

Upvotes: 6

Related Questions