Reputation: 301
Can someone clear up this doubt of mine?
While evaluating a model we should use a smaller set, and the dev set is a small set. So we try something on the dev set, come to a conclusion, and then go to the train set to train it properly and check.
OR
We train on the training set and evaluate the model on the dev set, with the dev set as a benchmark.
Upvotes: 1
Views: 2596
Reputation: 1484
We train our model on the training set and evaluate it on the dev and test sets. In a sense, the purpose of the test set is to make sure that our evaluation on the dev set is correct (we expect the dev and test errors to have close values).
Dev and test sets should have the same distribution; if they had different distributions we wouldn't be able to compare errors and reason about results.
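For illustration, here is a minimal sketch of such a split (assuming scikit-learn, placeholder random data, and an arbitrary 60/20/20 ratio). Stratifying on the labels is one simple way to keep train, dev, and test on the same label distribution:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 10)        # placeholder features
y = np.random.randint(0, 2, 1000)   # placeholder binary labels

# First carve out the test set, then split the remainder into train and dev.
# stratify=y keeps the class proportions the same in every split.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
X_train, X_dev, y_train, y_dev = train_test_split(
    X_rest, y_rest, test_size=0.25, stratify=y_rest, random_state=0)

print(len(X_train), len(X_dev), len(X_test))  # 600 200 200
```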
Upvotes: 0
Reputation: 109
Your first scenario is correct.
As training a deep neural network may take an enormous amount of time, it is not a good idea to first build the final model and only then, at the very end, try to evaluate whether it is working well or not! So what's the solution? This is where the dev set (also known as the validation set) comes in: it is much better to evaluate our model at the same time it is being trained.
To do this, we split our dataset into train data and dev data. Now, in each epoch, we can compute two extra metrics for our model aside from "accuracy" and "loss": dev_accuracy and dev_loss, which can be helpful for figuring out what is wrong with the model. For instance, if accuracy shows a high value on our training data (say 0.92) but dev_accuracy is only 0.3, it obviously means our model's issue is overfitting! (Why?) Because it works very well on the training data but cannot make good predictions on data that is new to the model (data it has never been shown during training).
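As a concrete illustration (a minimal sketch assuming TensorFlow/Keras and random placeholder data, not the actual model in question), passing the dev set to fit() via validation_data is what produces those per-epoch dev metrics:

```python
import numpy as np
from tensorflow import keras

# Placeholder data standing in for a real train/dev split.
X_train = np.random.rand(800, 20)
y_train = np.random.randint(0, 2, 800)
X_dev = np.random.rand(200, 20)
y_dev = np.random.randint(0, 2, 200)

model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# validation_data makes Keras report val_loss and val_accuracy
# (the "dev_loss" / "dev_accuracy" discussed above) every epoch.
history = model.fit(X_train, y_train,
                    validation_data=(X_dev, y_dev),
                    epochs=10, batch_size=32)

# A large gap, e.g. accuracy ~0.92 vs. val_accuracy ~0.3, points to overfitting.
print(history.history["accuracy"][-1], history.history["val_accuracy"][-1])
```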
Upvotes: 1
Reputation: 66815
Normally you would have three sets:
- train set: used to fit the model's parameters;
- validation (dev) set: used to fit hyperparameters and choose between candidate models;
- test set: used only for the final, unbiased estimate of performance.
For various reasons some of the above might be missing from a given setup, but this is the standard approach, and any modification requires good reasons.
Often datasets do not specify a "validation" split, as fitting hyperparameters etc. is considered part of the training; thus every data point used for that is de facto used to train your model (and so is part of the "train" dataset). In practice, this means that you have to split the train set on your own into "proper train" and "validation" (if the method being used requires fitting some additional hyperparameters).
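For instance (a minimal sketch assuming scikit-learn; the SVC model, its parameter grid, and the random data are placeholders), that split can be done explicitly, or implicitly via cross-validation on the train data while fitting hyperparameters:

```python
import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC

X_train_full = np.random.rand(500, 10)   # the dataset's published "train" split
y_train_full = np.random.randint(0, 2, 500)

# Option 1: carve an explicit validation set out of the published train data.
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train_full, y_train_full, test_size=0.2, random_state=0)

# Option 2: let cross-validation over the train data play the validation role
# while fitting a hyperparameter (here, C for an SVM).
search = GridSearchCV(SVC(), param_grid={"C": [0.1, 1, 10]}, cv=5)
search.fit(X_train_full, y_train_full)
print(search.best_params_)
```

Either way, the official test set stays untouched until the very end.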
Upvotes: 3