Reputation: 3359
I don't know what the issue is. Here is the code:
estimator = sagemaker.estimator.Estimator(
    image_uri=image_name,
    sagemaker_session=sagemaker_session,
    role=role,
    train_instance_count=1,
    train_instance_type="ml.m5.large",
    base_job_name="deepar-stock",
    output_path=s3_output_path,
)
hyperparameters = {
    "time_freq": "24H",
    "epochs": "100",
    "early_stopping_patience": "10",
    "mini_batch_size": "64",
    "learning_rate": "5E-4",
    "context_length": str(context_length),
    "prediction_length": str(prediction_length),
    "likelihood": "gaussian",
}
estimator.set_hyperparameters(**hyperparameters)
%%time
estimator.fit(inputs=f"{s3_data_path}/train/")
And when I try to train the model I get the following error (in its entirety).
---------------------------------------------------------------------------
UnexpectedStatusException Traceback (most recent call last)
<timed eval> in <module>
/opt/conda/lib/python3.7/site-packages/sagemaker/estimator.py in fit(self, inputs, wait, logs, job_name, experiment_config)
681 self.jobs.append(self.latest_training_job)
682 if wait:
--> 683 self.latest_training_job.wait(logs=logs)
684
685 def _compilation_job_name(self):
/opt/conda/lib/python3.7/site-packages/sagemaker/estimator.py in wait(self, logs)
1626 # If logs are requested, call logs_for_jobs.
1627 if logs != "None":
-> 1628 self.sagemaker_session.logs_for_job(self.job_name, wait=True, log_type=logs)
1629 else:
1630 self.sagemaker_session.wait_for_job(self.job_name)
/opt/conda/lib/python3.7/site-packages/sagemaker/session.py in logs_for_job(self, job_name, wait, poll, log_type)
3658
3659 if wait:
-> 3660 self._check_job_status(job_name, description, "TrainingJobStatus")
3661 if dot:
3662 print()
/opt/conda/lib/python3.7/site-packages/sagemaker/session.py in _check_job_status(self, job, desc, status_key_name)
3218 ),
3219 allowed_statuses=["Completed", "Stopped"],
-> 3220 actual_status=status,
3221 )
3222
UnexpectedStatusException: Error for Training job deepar-2021-07-31-22-25-54-110: Failed. Reason: ClientError: Unable to initialize the algorithm. Failed to validate input data configuration. (caused by ValidationError)
Caused by: Additional properties are not allowed ('training' was unexpected)
Failed validating 'additionalProperties' in schema:
{'$schema': 'http://json-schema.org/draft-04/schema#',
'additionalProperties': False,
'anyOf': [{'required': ['train']}, {'required': ['state']}],
'definitions': {'data_channel': {'properties': {'ContentType': {'enum': ['json',
'json.gz',
'parquet',
'auto'],
'type': 'string'},
'RecordWrapperType': {'enum': ['None'],
On instance:
{'training': {'RecordWrapperType': 'None',
'S3DistributionType': 'FullyReplicated',
'TrainingInputMode': 'File'}}
Here it says 'training' was unexpected. I don't know why it says 'training' on that last line, under On instance:. I don't know how to solve this. I've looked at other pages for help but I can't find a straight answer. I know that my data is structured right. The errors seem to be with the hyperparameters but I don't know that for sure. Please help!
Upvotes: 1
Views: 438
Reputation: 132
DeepAR expects its data inputs as a dictionary keyed by channel name; simply passing a file path does not work here. When you pass a bare S3 path, the SDK puts it in a default channel named 'training', and DeepAR's input schema rejects any channel name it does not recognize, which is why the error says 'training' was unexpected. The channel names matter because all AWS estimators (built-in and custom) run in containers. Each time a training job runs, a new container is started for it, and each container has its own generic directory layout. The training data path inside the container is typically something like /opt/ml/input/data/train. When the job starts, SageMaker expects the data in the form data = {'train': x, 'test': y}. You need to set these keys and values because SageMaker looks for them and copies the data from data['train'] to the generic location inside the container associated with training data. Similarly, if you had set up DeepAR for testing, it would copy the data from data['test'] to a location inside the container such as /opt/ml/input/data/test.
A good way to learn this is to build custom models using script mode, which forces you to understand exactly how to access the default container directory and how to change it.
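To make the channel idea concrete, here is a minimal sketch in plain Python. It does not call the SageMaker SDK; `as_channels` and `container_dir` are hypothetical helpers that only mimic the behaviour described above, and the bucket name is made up. The `/opt/ml/input/data/<channel>` layout is assumed for File input mode.

```python
from typing import Dict, Union

# Assumption: in File mode, each channel is mounted under this prefix
# inside the training container.
CONTAINER_DATA_ROOT = "/opt/ml/input/data"


def as_channels(inputs: Union[str, Dict[str, str]]) -> Dict[str, str]:
    """Illustrative stand-in for how inputs become named channels.

    A bare S3 URI ends up in a default channel named "training";
    a dict is used as given, keyed by channel name.
    """
    if isinstance(inputs, str):
        return {"training": inputs}
    return dict(inputs)


def container_dir(channel: str) -> str:
    """Directory where a channel's data appears inside the container."""
    return f"{CONTAINER_DATA_ROOT}/{channel}"


s3_data_path = "s3://example-bucket/deepar"  # hypothetical bucket

# Compare a bare path with an explicit channel dictionary.
for inputs in (f"{s3_data_path}/train/", {"train": f"{s3_data_path}/train/"}):
    for channel, uri in as_channels(inputs).items():
        print(f"{uri} -> channel {channel!r} -> {container_dir(channel)}")
```

The bare path lands in a channel called 'training', which is the name the validation error complains about, while the dictionary produces the 'train' channel that DeepAR looks for.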
Upvotes: 0
Reputation: 3359
I just needed to add the data_channels line and change the fit call to look like this:
data_channels = {"train": f"{s3_data_path}/train/"}
estimator.fit(inputs=data_channels)
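If it helps anyone else, a small pre-flight check can catch a wrong channel name before a training job is even started. This is only a sketch: `check_channels` is a hypothetical helper, not part of the SageMaker SDK, and the allowed names ('train' and 'test') are an assumption, since the schema in the error message above is truncated.

```python
from typing import Dict, Iterable

# Assumption: channel names accepted by DeepAR; adjust for other algorithms.
DEEPAR_CHANNELS = ("train", "test")


def check_channels(
    channels: Dict[str, str],
    allowed: Iterable[str] = DEEPAR_CHANNELS,
) -> Dict[str, str]:
    """Raise ValueError if any channel name is not in the allowed set."""
    unexpected = sorted(set(channels) - set(allowed))
    if unexpected:
        raise ValueError(
            f"Unexpected channel name(s) {unexpected}; "
            f"allowed: {sorted(allowed)}"
        )
    return channels


s3_data_path = "s3://example-bucket/deepar"  # hypothetical bucket
data_channels = check_channels({"train": f"{s3_data_path}/train/"})
# data_channels can then be passed on: estimator.fit(inputs=data_channels)
```

Calling `check_channels({"training": "..."})` raises a ValueError locally, instead of failing minutes later inside the training job.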
Upvotes: 2