Reputation: 3303
I am trying to use an XGBoost model in SageMaker and use it to score a large dataset stored in S3 using Batch Transform.
I build the model using the existing SageMaker container as follows:
estimator = sagemaker.estimator.Estimator(image_name=container,
                                          hyperparameters=hyperparameters,
                                          role=sagemaker.get_execution_role(),
                                          train_instance_count=1,
                                          train_instance_type='ml.m5.2xlarge',
                                          train_volume_size=5,  # 5 GB
                                          output_path=output_path,
                                          train_use_spot_instances=True,
                                          train_max_run=300,
                                          train_max_wait=600)

estimator.fit({'train': s3_input_train, 'validation': s3_input_test})
The following code is used to do the Batch Transform:

# The location of the test dataset
batch_input = 's3://{}/{}/test/examples'.format(bucket, prefix)
# The location to store the results of the batch transform job
batch_output = 's3://{}/{}/batch-inference'.format(bucket, prefix)
transformer = xgb_model.transformer(instance_count=1, instance_type='ml.m4.xlarge', output_path=batch_output)
transformer.transform(data=batch_input, data_type='S3Prefix', content_type='text/csv', split_type='Line')
transformer.wait()
The above code works fine in the development environment (a Jupyter notebook) when the model is built in the same session. However, I would like to deploy the stored model and call it for Batch Transform.
Most examples of SageMaker endpoint creation are for scoring a single record, not for batch transform.
Can someone point me to how to deploy a model and use it for Batch Transform in SageMaker? Thank you.
Upvotes: 2
Views: 4722
The following link has an example of how to call a stored model in SageMaker to run Batch Transform job.
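In outline, the approach is to re-create a `Model` object from the training job's stored artifact and request a transform job from it; no persistent real-time endpoint is needed. Below is a minimal sketch, assuming SageMaker Python SDK v1 (to match the parameter names in the question); the bucket/prefix paths and the XGBoost repo version are placeholders you would replace with your own:

```python
import sagemaker
from sagemaker.model import Model

session = sagemaker.Session()
role = sagemaker.get_execution_role()

# Hypothetical S3 path to the artifact written by the training job
# (after fit(), estimator.model_data points at this model.tar.gz).
model_artifact = 's3://my-bucket/my-prefix/output/model.tar.gz'

# Use the same built-in XGBoost container image that was used for training.
container = sagemaker.amazon.amazon_estimator.get_image_uri(
    session.boto_region_name, 'xgboost', repo_version='0.90-2')

# Re-create a Model from the stored artifact.
xgb_model = Model(model_data=model_artifact,
                  image=container,
                  role=role,
                  sagemaker_session=session)

# A transform job spins up transient instances, scores every object
# under the input prefix, and writes results to the output prefix.
transformer = xgb_model.transformer(
    instance_count=1,
    instance_type='ml.m4.xlarge',
    output_path='s3://my-bucket/my-prefix/batch-inference')

transformer.transform(data='s3://my-bucket/my-prefix/test/examples',
                      data_type='S3Prefix',
                      content_type='text/csv',
                      split_type='Line')
transformer.wait()
```

Since Batch Transform runs on its own transient instances, deploying a real-time endpoint with `xgb_model.deploy(...)` is only needed for single-record, low-latency scoring, not for this workflow.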
Upvotes: 1