Reputation: 13334
Can SageMaker dynamically allocate resources based on the load? For example, how easy would it be to run 5,000 models in parallel? How does parameter tuning come into play?
Upvotes: 2
Views: 203
Reputation: 48256
You can certainly tune many models in parallel. For hyperparameter tuning you can set "Maximum number of parallel training jobs", and that many tuning jobs will run at the same time (see the sketch below).
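A minimal sketch of that setting using the SageMaker Python SDK. The image URI, role, S3 paths, metric name, and regex are placeholders you would replace with your own; the parallelism limit is the `max_parallel_jobs` argument:

```python
from sagemaker.estimator import Estimator
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter

# Placeholder estimator; swap in your own training image and role.
estimator = Estimator(
    image_uri="<your-training-image>",
    role="<your-sagemaker-execution-role>",
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:accuracy",  # assumed metric name
    metric_definitions=[
        # Assumed regex for extracting the metric from the training logs.
        {"Name": "validation:accuracy", "Regex": "accuracy=([0-9\\.]+)"}
    ],
    hyperparameter_ranges={"learning_rate": ContinuousParameter(1e-4, 1e-1)},
    strategy="Bayesian",
    max_jobs=100,          # total training jobs launched by the tuning job
    max_parallel_jobs=10,  # "Maximum number of parallel training jobs"
)

tuner.fit({"train": "s3://<your-bucket>/train"})  # placeholder S3 path
```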
Highly parallel tuning might not be the best idea, however. SageMaker uses Bayesian optimization to speed up tuning, and because each step relies on the results of previous steps, parallelism is of limited use for Bayesian tuning. See https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-considerations.html#automatic-model-tuning-parallelism
In terms of training, there is no built-in support for this kind of scaling. You can run a single training job across many machines (see "instance count"), run many training jobs in parallel by making many API calls (as sketched below), or use something like https://automl.github.io/auto-sklearn/master/, which trains many models inside a single training job
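A minimal sketch of the "many API calls" approach using the low-level boto3 client, assuming one independent job per model. The image, role, and S3 paths are placeholders, and note that account-level service quotas cap how many training jobs can run concurrently, so 5,000 truly simultaneous jobs would likely need a quota increase:

```python
import boto3

sm = boto3.client("sagemaker")

for i in range(5000):  # one API call per model; concurrency is subject to account quotas
    sm.create_training_job(
        TrainingJobName=f"model-{i}",
        AlgorithmSpecification={
            "TrainingImage": "<your-training-image>",  # placeholder
            "TrainingInputMode": "File",
        },
        RoleArn="<your-sagemaker-execution-role>",  # placeholder
        InputDataConfig=[{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": f"s3://<your-bucket>/train/{i}",  # placeholder per-model data
            }},
        }],
        OutputDataConfig={"S3OutputPath": "s3://<your-bucket>/output"},
        ResourceConfig={
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,  # raise this instead to scale a single job across machines
            "VolumeSizeInGB": 30,
        },
        StoppingCondition={"MaxRuntimeInSeconds": 3600},
    )
```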
Upvotes: 1