Metric Document is too large in Azure ML Service

Question

I am trying to save three metrics (loss, validation loss and mAP) at every epoch, in runs of 100 and 50 epochs, but at the end of the experiment I get this error: Run failed: RunHistory finalization failed: ServiceException: Code: 400 Message: (ValidationError) Metric Document is too large

I am using this code to save the metrics:

run.log_list("loss", history.history["loss"])
run.log_list("val_loss", history.history["val_loss"])
run.log_list("val_mean_average_precision", history.history["val_mean_average_precision"])

I don't understand why trying to save only 3 metrics exceeds the limits of Azure ML Service.

Roope Astala - MSFT · Accepted Answer

You could break the run history list writes into smaller blocks, splitting each list at some index N, like this:

run.log_list("loss", history.history["loss"][:N])
run.log_list("loss", history.history["loss"][N:])

Internally, the run history service concatenates the blocks with the same metric name into one contiguous list.
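
If a list needs more than two blocks, the same idea can be wrapped in a small helper. This is only a sketch: log_list_in_chunks and its chunk_size default of 25 are made-up names and values for illustration, not part of the Azure ML SDK, and the answer above does not say what block size the service accepts, so the number may need tuning.

```python
def log_list_in_chunks(run, name, values, chunk_size=25):
    """Log `values` under `name` in consecutive blocks of at most `chunk_size` items.

    `run` is assumed to be an Azure ML Run object (anything with a
    log_list(name, value) method works). `chunk_size` is a hypothetical
    default, not a documented service limit.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    values = list(values)
    # Each call writes one block; blocks are sent in their original order.
    for start in range(0, len(values), chunk_size):
        run.log_list(name, values[start:start + chunk_size])


# Possible usage with the metrics from the question:
# for name in ("loss", "val_loss", "val_mean_average_precision"):
#     log_list_in_chunks(run, name, history.history[name])
```

Because blocks that share a metric name are concatenated, the metric should still show up as a single list in the run history.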
