Reputation: 1405
I'm trying to implement grid search, or a more sophisticated hyperparameter search, in Vowpal Wabbit. Is there a relatively simple way to get the loss function value obtained on a validation set (the holdout set in vw) for this purpose? VW must compute it, e.g. for every number of passes, because early stopping happens depending on its value.
So far I have worked around this by creating a separate file with a validation dataset, saving different models' predictions on it, and comparing their performance in Python, thereby wasting data unnecessarily. But maybe there is a way to use vw's holdout scores explicitly?
Upvotes: 2
Views: 1185
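The workaround described in the question (saving each model's predictions on a validation file and comparing them in Python) can be sketched as follows. This is a minimal illustration, not code from the post: the function name, the choice of squared loss, and the toy data are all assumptions.

```python
def mean_squared_loss(predictions_text, vw_data_text):
    """Compare vw predictions (one value per line, optionally followed by a
    tag) against the gold labels of a vw-format data file, where each line
    starts with the label before the first '|'. Returns the average squared
    loss. Both arguments are raw file contents as strings."""
    preds = [float(line.split()[0])
             for line in predictions_text.splitlines() if line.strip()]
    labels = [float(line.split('|')[0].split()[0])
              for line in vw_data_text.splitlines() if line.strip()]
    assert len(preds) == len(labels), "prediction/label count mismatch"
    return sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(preds)

# Toy example (made-up values, not real vw output):
preds = "0.5\n1.2\n"
data = "1 | a:1 b:2\n1.0 | a:0.5\n"
print(mean_squared_loss(preds, data))
```

In practice one would read `predictions.txt` (written by `vw -p`) and the validation file from disk and pick the lowest-loss model.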
Reputation: 2670
To summarize the comments, there are several ways to get the holdout loss from VW (they can be combined):

- By default (unless --holdout_off is specified), VW computes the holdout loss on each 10th example (not on a random 1/10 of the examples). Using --holdout_period one can specify a number other than 10.
- --holdout_after=N specifies that the first N examples of the input data are used for training and the rest of the file as the holdout set (instead of each 10th example).
- Use -p predictions.txt and compute the loss outside of VW (by comparing predictions.txt with the gold labels in the input data). When X passes are used, predictions.txt will contain X*number_of_input_data_examples predictions. It is therefore recommended to train on the training data (possibly with multiple passes), save the model to a file, and then use VW only to predict: vw -i trained.model -t -d test.input -p test.predictions.
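Putting the flags above together, a possible end-to-end session might look like the following sketch. The file names and pass/example counts are illustrative, not taken from the post; no assertions are attached since it is a CLI fragment that requires an installed vw binary.

```shell
# 1. Train with multiple passes (a cache file is required for --passes);
#    here the first 8000 examples train and the rest form the holdout set
#    used for early stopping. -f saves the final model.
vw -d train.input --passes 10 --cache_file train.cache \
   --holdout_after 8000 -f trained.model

# 2. Predict only (-t disables learning) on a separate file; vw reports
#    the average loss, and -p writes one prediction per example.
vw -i trained.model -t -d test.input -p test.predictions
```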
- --save_per_pass, or running vw --daemon and saving the model on demand, may also be helpful.

Upvotes: 2