L Xandor

Reputation: 1851

Setting max run time for a SageMaker HyperParameter Tuning job

My training jobs only run for a minute or two, so I have increased the resource limit so that I can run a large number (500) in parallel. However, I would like to set an upper bound so that they can't accidentally run for several hours each, times 500.

In the documentation I can find the following:

Maximum run time for a hyperparameter tuning job: 30 days

30 days is def too much lol, but how can I change it? I would love to be able to make the tuning job stop once it hits a maximum total training time, but unlike the other limits, there's no mention that this one can be changed.

Upvotes: 0

Views: 749

Answers (1)

Gili Nachum

Reputation: 5578

While there's no Tuner parameter that limits the tuning job's duration, you can set an effective spend limit (in USD) using the Tuner's max_jobs parameter:

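from sagemaker.tuner import HyperparameterTuner

allowed_spend_usd = 50  # total budget: $50
instance_cost_usd_hr = 0.1  # hourly price of the training instance
# budget / hourly price = hours of training you can afford; * 60 = minutes
total_train_minutes_allowed = allowed_spend_usd * 60 / instance_cost_usd_hr
minutes_per_job = 2  # you know this empirically
max_jobs = round(total_train_minutes_allowed / minutes_per_job)

tuner = HyperparameterTuner(max_jobs=max_jobs, ...)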

I recommend that you also set a reasonable max_run (in seconds) on the estimator used for each training job, to further ensure that training jobs finish as fast as you expect (say 300 seconds if you expect 60-120 seconds).
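To see how max_jobs and max_run combine, here is a rough sketch of the worst-case spend if every job ran until max_run stopped it. The function name worst_case_spend_usd and the instances_per_job argument are made up for illustration, the numbers are the ones from the example above, and the estimate only covers instance time (it ignores storage and any other charges).

```python
def worst_case_spend_usd(max_jobs, max_run_seconds, instance_cost_usd_hr,
                         instances_per_job=1):
    """Rough upper bound on instance cost, assuming every training job
    runs for the full max_run before it is stopped.

    Hypothetical helper for illustration; not part of the SageMaker SDK.
    """
    # Total instance-hours if all jobs hit the per-job time limit.
    worst_case_hours = max_jobs * instances_per_job * max_run_seconds / 3600
    return worst_case_hours * instance_cost_usd_hr


# Values from the answer above: 15000 jobs, 300 s limit, $0.10 per hour.
bound = worst_case_spend_usd(max_jobs=15000, max_run_seconds=300,
                             instance_cost_usd_hr=0.1)
print(f"Worst-case instance cost: ${bound:.2f}")
```

With these numbers the bound is higher than the $50 budget, because max_jobs was sized for 2-minute jobs while max_run allows up to 5 minutes each. If you need a hard cap, size max_jobs using the max_run value instead of the typical job duration.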

Upvotes: 1
