Nick Dragosh

Reputation: 515

SageMaker Hyperparameter Optimization XGBoost

I am trying to build a hyperparameter optimization job in Amazon SageMaker in Python, but something is not working. Here is what I have:

sess = sagemaker.Session()

xgb = sagemaker.estimator.Estimator(containers[boto3.Session().region_name],
                                    role, 
                                    train_instance_count=1, 
                                    train_instance_type='ml.m4.4xlarge',
                                    output_path=output_path_1,
                                    base_job_name='HPO-xgb',
                                    sagemaker_session=sess)

from sagemaker.tuner import HyperparameterTuner, IntegerParameter, CategoricalParameter, ContinuousParameter    

hyperparameter_ranges = {'eta': ContinuousParameter(0.01, 0.2),
                         'num_rounds': ContinuousParameter(100, 500),
                         'num_class':  4,
                         'max_depth': IntegerParameter(3, 9),
                         'gamma': IntegerParameter(0, 5),
                         'min_child_weight': IntegerParameter(2, 6),
                         'subsample': ContinuousParameter(0.5, 0.9),
                         'colsample_bytree': ContinuousParameter(0.5, 0.9)}

objective_metric_name = 'validation:mlogloss'
objective_type='minimize'
metric_definitions = [{'Name': 'validation-mlogloss',
                       'Regex': 'validation-mlogloss=([0-9\\.]+)'}]

tuner = HyperparameterTuner(xgb,
                            objective_metric_name,
                            objective_type,
                            hyperparameter_ranges,
                            metric_definitions,
                            max_jobs=9,
                            max_parallel_jobs=3)

tuner.fit({'train': s3_input_train, 'validation': s3_input_validation}) 

And the error I get is:

AttributeError: 'str' object has no attribute 'keys'

The error seems to come from the tuner.py file:

----> 1 tuner.fit({'train': s3_input_train, 'validation': s3_input_validation})

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/tuner.py in fit(self, inputs, job_name, **kwargs)
    144             self.estimator._prepare_for_training(job_name)
    145 
--> 146         self._prepare_for_training(job_name=job_name)
    147         self.latest_tuning_job = _TuningJob.start_new(self, inputs)
    148 

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/tuner.py in _prepare_for_training(self, job_name)
    120 
    121         self.static_hyperparameters = {to_str(k): to_str(v) for (k, v) in self.estimator.hyperparameters().items()}
--> 122         for hyperparameter_name in self._hyperparameter_ranges.keys():
    123             self.static_hyperparameters.pop(hyperparameter_name, None)
    124 

AttributeError: 'list' object has no attribute 'keys'                           

Upvotes: 3

Views: 1896

Answers (1)

Farhan

Reputation: 439

The positional arguments you pass when initializing the HyperparameterTuner object are in the wrong order. The constructor has the following signature:

HyperparameterTuner(estimator, 
                    objective_metric_name, 
                    hyperparameter_ranges, 
                    metric_definitions=None, 
                    strategy='Bayesian', 
                    objective_type='Maximize', 
                    max_jobs=1, 
                    max_parallel_jobs=1, 
                    tags=None, 
                    base_tuning_job_name=None)

so in this case your objective_type ('minimize', a string) lands in the hyperparameter_ranges slot, which is why the tuner fails when it tries to call .keys() on it. See the docs for more details.
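
A minimal sketch of the corrected call, keeping the variable names from your snippet and passing everything by keyword so the argument order can no longer cause this error (this only addresses the argument order, not anything else in your ranges dictionary):

# Keyword arguments make the intended parameter mapping explicit.
tuner = HyperparameterTuner(estimator=xgb,
                            objective_metric_name=objective_metric_name,
                            hyperparameter_ranges=hyperparameter_ranges,
                            metric_definitions=metric_definitions,
                            # note the capitalization, matching the 'Maximize' default above
                            objective_type='Minimize',
                            max_jobs=9,
                            max_parallel_jobs=3)

tuner.fit({'train': s3_input_train, 'validation': s3_input_validation})

Using keywords rather than positions avoids silently mapping the string 'minimize' onto hyperparameter_ranges, which is exactly what triggered the AttributeError above.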

Upvotes: 5
