Cybernetic

Reputation: 13334

Can SageMaker dynamically allocate resources based on the load? (e.g. run 5000 models in parallel with parameter tuning)

Can SageMaker dynamically allocate resources based on the load? For example, how easy would it be to run 5000 models in parallel? How does parameter tuning come into play?

Upvotes: 2

Views: 203

Answers (1)

Neil McGuigan

Reputation: 48256

You can certainly tune many models in parallel. For hyperparameter tuning, you can set "Maximum number of parallel training jobs", and up to that many training jobs will run at the same time.

Parallel tuning might not be the best idea, however. Bayesian optimization can be used to speed up tuning, but each new set of hyperparameters is chosen based on the results of earlier jobs, so running many jobs at once limits how much the search can learn along the way. See https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-considerations.html#automatic-model-tuning-parallelism
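To make the trade-off concrete, here is a minimal sketch of the tuning configuration as a plain Python dict. The key names follow the `HyperParameterTuningJobConfig` structure accepted by boto3's `create_hyper_parameter_tuning_job`; the metric name, the `eta` range, and both helper functions are my own illustrative assumptions, not something from the question. The "rounds" figure is only a rough intuition, since in practice a new job starts whenever a slot frees up rather than in strict batches.

```python
import math


def tuning_job_config(max_jobs, max_parallel, strategy="Bayesian"):
    """Build a HyperParameterTuningJobConfig-style dict (hypothetical helper).

    The result is meant to be passed as HyperParameterTuningJobConfig to
    boto3's sagemaker client create_hyper_parameter_tuning_job call.
    """
    if not 1 <= max_parallel <= max_jobs:
        raise ValueError("max_parallel must be between 1 and max_jobs")
    return {
        "Strategy": strategy,  # "Bayesian" or "Random"
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            "MetricName": "validation:auc",  # assumed metric, adjust to your algorithm
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel,
        },
        "ParameterRanges": {
            "ContinuousParameterRanges": [
                # assumed example range; values are passed as strings
                {"Name": "eta", "MinValue": "0.01", "MaxValue": "0.3"},
            ],
        },
    }


def approximate_rounds(config):
    """Rough number of sequential batches the search gets to learn from."""
    limits = config["ResourceLimits"]
    return math.ceil(
        limits["MaxNumberOfTrainingJobs"] / limits["MaxParallelTrainingJobs"]
    )


config = tuning_job_config(max_jobs=100, max_parallel=10)
print(approximate_rounds(config))
```

With 100 jobs and 10 in parallel, the Bayesian search gets roughly 10 chances to use earlier results. With all 100 in parallel it gets one, which is effectively a random search.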

In terms of training, there is no built-in support for automatically scaling resources with load. You can run a single training job on many machines (see "instance count"), run many training jobs in parallel (by making many API calls), or use something like https://automl.github.io/auto-sklearn/master/, which would run many models within one training job.
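As a sketch of the "many API calls" option, the snippet below expands a hyperparameter grid into one request per model. The `TrainingJobName`, `HyperParameters` and `ResourceConfig` keys match the request shape of boto3's `create_training_job`, but the helper, the job-name prefix, the instance type and the volume size are hypothetical choices of mine. A real call also needs `AlgorithmSpecification`, `RoleArn`, `OutputDataConfig` and `StoppingCondition`, which are left out here.

```python
import itertools


def training_job_requests(name_prefix, grid, instance_type="ml.m5.xlarge",
                          instance_count=1):
    """Expand a hyperparameter grid into partial create_training_job requests.

    Hypothetical helper: each dict would still need AlgorithmSpecification,
    RoleArn, OutputDataConfig and StoppingCondition before being passed to
    boto3's sagemaker client as create_training_job(**request).
    """
    names = sorted(grid)
    combos = itertools.product(*(grid[name] for name in names))
    requests = []
    for index, values in enumerate(combos):
        requests.append({
            "TrainingJobName": f"{name_prefix}-{index:04d}",
            # SageMaker expects hyperparameter values as strings
            "HyperParameters": {n: str(v) for n, v in zip(names, values)},
            "ResourceConfig": {
                "InstanceType": instance_type,  # assumed instance type
                "InstanceCount": instance_count,  # >1 spreads one job over machines
                "VolumeSizeInGB": 30,  # assumed volume size
            },
        })
    return requests


grid = {"max_depth": [3, 5], "eta": [0.05, 0.1, 0.2]}
requests = training_job_requests("demo-model", grid)
print(len(requests), requests[0]["TrainingJobName"])
```

Each request becomes a separate training job, so 5000 models means 5000 calls. How many actually run at once will likely depend on the instance limits of your account.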

Upvotes: 1
