kurtosis

Reputation: 1405

Get holdout loss in Vowpal Wabbit

I'm trying to implement grid search, or a more sophisticated hyperparameter search, in Vowpal Wabbit. Is there a relatively simple way to get the value of the loss function on a validation set (holdout in vw) for this purpose? VW must have computed it, e.g. for every pass, because early stopping depends on its value.

So far I have worked around this by creating a separate file with a validation dataset, saving different models' predictions on this dataset, and comparing their performance in Python, which wastes data unnecessarily. But maybe there is a way to use the vw holdout scores explicitly?

Upvotes: 2

Views: 1185

Answers (1)

Martin Popel

Reputation: 2670

To summarize the comments, there are several ways to get the holdout loss from VW (they can be combined):

  1. With one-pass learning, VW reports progressive validation loss, which (simply put) converges to approximately the same value as the holdout loss after enough examples.
  2. With multiple passes, VW reports holdout loss (unless --holdout_off is specified) computed on every 10th example (not on a random 1/10 of the examples). Use --holdout_period to specify a number other than 10.
  3. The parameter --holdout_after=N specifies that the first N examples of the input data are used for training and the rest of the file as the holdout set (instead of every 10th example).
  4. One can use -p predictions.txt and compute the loss outside of VW (by comparing predictions.txt with the gold labels in the input data). When X passes are used, predictions.txt will contain X * number_of_input_data_examples predictions. It is therefore recommended to train on the training data (possibly with multiple passes), save the model to a file, and then use VW only to predict: vw -i trained.model -t -d test.input -p test.predictions.
  5. In some scenarios, --save_per_pass, or running vw --daemon and saving the model on demand, may be helpful.
  6. To compute both the holdout (test) loss and the train loss conveniently from the command line, one can use vw-experiment.
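
For option 4, the comparison outside of VW could look roughly like the Python sketch below. It is only an illustration under several assumptions: each example in the data file has a simple numeric label as the first token before the first `|`, each line of the predictions file starts with the predicted value (optionally followed by a tag), and the loss of interest is squared loss (VW's default). The function names are made up for this example and are not part of VW.

```python
def read_gold_labels(lines):
    """Yield the numeric label of each VW-format example line.

    Assumes the simple-label layout 'label [importance] ['tag]| features',
    so the label is the first token before the first '|'.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        head = line.split("|", 1)[0].split()
        if not head:
            raise ValueError("example has no label: %r" % line)
        yield float(head[0])


def read_predictions(lines):
    """Yield the predicted value from each line of a predictions file.

    Assumes each line is 'prediction [tag]'.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        yield float(line.split()[0])


def mean_squared_loss(gold, predicted):
    """Average squared difference between gold labels and predictions."""
    gold = list(gold)
    predicted = list(predicted)
    if len(gold) != len(predicted):
        raise ValueError(
            "got %d labels but %d predictions" % (len(gold), len(predicted))
        )
    if not gold:
        raise ValueError("no examples to evaluate")
    return sum((g - p) ** 2 for g, p in zip(gold, predicted)) / len(gold)


def holdout_loss(data_path, predictions_path):
    """Compute the squared loss of a predictions file against a data file."""
    with open(data_path) as data, open(predictions_path) as preds:
        return mean_squared_loss(read_gold_labels(data), read_predictions(preds))
```

After running the prediction command from option 4, `holdout_loss("test.input", "test.predictions")` would return a single number per trained model, which a grid search can then compare across hyperparameter settings. For another loss function (e.g. logistic), `mean_squared_loss` would have to be replaced accordingly.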

Upvotes: 2
