Reputation: 67
I'm using Google Cloud ML Engine to train a model with tensorflow.contrib.learn.Experiment. By default, TensorFlow seems to have the master run the evaluations. I only run evals after training is complete (min_eval_frequency=0), and my master has a large number of cores and plenty of RAM but no GPU, so the eval is very slow relative to the P100 workers. Can I make the eval run on a worker?
Upvotes: 0
Views: 282
Reputation: 8389
When using learn_runner.run, there is no way to run evaluation on regular workers. Here are a few alternatives:
Do not use learn_runner.run. Instead, you'll have to reproduce that functionality. To wit: create an instance of RunConfig(), inspect its task_type, and invoke Experiment.train, Experiment.evaluate, or Experiment.continuous_eval as necessary.
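The manual dispatch described above could look roughly like the sketch below. It is an illustration under assumptions, not what learn_runner does internally. To stay runnable without TensorFlow it reads the task from the TF_CONFIG environment variable directly; in real code the same values should be available from RunConfig().task_type and RunConfig().task_id. The helper names task_from_env and run_task, the eval_worker_index parameter, and the choice to evaluate on worker 0 after its training call returns are all hypothetical. The experiment argument is expected to be a tf.contrib.learn.Experiment (or anything exposing train, evaluate, and run_std_server).

```python
import json
import os


def task_from_env(environ=os.environ):
    """Parse (task_type, task_index) from TF_CONFIG; (None, 0) if it is unset."""
    tf_config = json.loads(environ.get("TF_CONFIG") or "{}")
    task = tf_config.get("task", {})
    return task.get("type"), int(task.get("index", 0))


def run_task(experiment, task_type, task_index, eval_worker_index=0):
    """Call the Experiment method that matches this process's role.

    Assumption: the worker with index `eval_worker_index` runs the final
    evaluation once its own train() call has returned.
    """
    if task_type == "ps":
        # Parameter servers only host variables.
        return experiment.run_std_server()
    if task_type == "master":
        # The master trains but skips evaluation in this sketch.
        return experiment.train()
    if task_type == "worker":
        experiment.train()
        if task_index == eval_worker_index:
            return experiment.evaluate()
        return None
    # No cluster information: treat as a single local process.
    experiment.train()
    return experiment.evaluate()


# Usage (hypothetical `build_experiment` is your own factory function):
#   task_type, task_index = task_from_env()
#   run_task(build_experiment(), task_type, task_index)
```

One caveat with this design: the evaluating worker reads the latest checkpoint in the model directory, so it may need to wait until the master has written the final checkpoint before evaluate is meaningful.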
That said, since the Master is basically just another worker that also does evaluation, is there any reason not to use a GPU on the Master?
Upvotes: 1