Reputation: 669
The SageMaker API lets you create a model (see the sample below). Are there any parameters we can pass as environment variables to specify the number of workers/CPUs for the instance type that we choose?
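import boto3

sagemaker = boto3.client('sagemaker')

def model():
    sagemaker.create_model(
        ModelName='mymodel',
        PrimaryContainer={
            'ModelDataUrl': "s3://modelloation",
            # environment variables passed to the inference container
            'Environment': {
            }
        }
    )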
Upvotes: 2
Views: 2274
Reputation: 395
I don't know if this helps, but for inference you need to pass the environment variable like this: 'SAGEMAKER_MODEL_SERVER_WORKERS': '4'
so that the model server can spawn multiple workers.
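To show where that variable would go, here is a minimal sketch that builds the arguments for `create_model` with the worker count placed in `PrimaryContainer.Environment`. The helper names (`build_create_model_request`, `create_model`), the image URI, and the role ARN are placeholders I made up for illustration, and whether the variable is actually honored depends on the serving container you use, so treat it as an assumption to verify for your image.

```python
def build_create_model_request(model_name, image_uri, model_data_url, role_arn, workers):
    """Build keyword arguments for the boto3 SageMaker client's create_model call."""
    return {
        "ModelName": model_name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {
            "Image": image_uri,
            "ModelDataUrl": model_data_url,
            # Environment values must be strings, so the count is converted.
            "Environment": {"SAGEMAKER_MODEL_SERVER_WORKERS": str(workers)},
        },
    }


def create_model(request):
    """Send the request to SageMaker (needs boto3 and AWS credentials)."""
    # Imported lazily so the builder above works without boto3 installed.
    import boto3

    return boto3.client("sagemaker").create_model(**request)


request = build_create_model_request(
    model_name="mymodel",
    image_uri="<inference-image-uri>",    # placeholder, not a real image
    model_data_url="s3://modelloation",   # bucket name from the question
    role_arn="<execution-role-arn>",      # placeholder, not a real role
    workers=4,
)
```

Calling `create_model(request)` would then submit it. The builder is kept separate from the API call so the request can be inspected before anything is sent to AWS.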
Upvotes: 2
Reputation: 1152
Once you create the model, you have two options to host your model for inference.
If you need real-time inference, you can use the CreateEndpointConfig API to specify the instance types and count, and then create an endpoint using the specified configuration.
If you need batch inference, you can use the TransformResources section of the CreateTransformJob API to specify the instance type and instance count.
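As a rough sketch of the two options, the snippet below builds the request arguments for `create_endpoint_config` and `create_transform_job` as they would be passed to the boto3 SageMaker client. The helper names, the resource names, the S3 paths, and the `ml.m5.xlarge` instance type are all example values I picked, not anything from the question.

```python
def build_endpoint_config_request(config_name, model_name, instance_type, instance_count):
    """Arguments for create_endpoint_config (real-time inference)."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": instance_count,
            }
        ],
    }


def build_transform_job_request(job_name, model_name, input_uri, output_uri,
                                instance_type, instance_count):
    """Arguments for create_transform_job (batch inference)."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_uri}
            }
        },
        "TransformOutput": {"S3OutputPath": output_uri},
        # Instance type and count for the batch job go here.
        "TransformResources": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
    }


# Example values only; replace with your own names and S3 locations.
endpoint_config = build_endpoint_config_request(
    "mymodel-config", "mymodel", "ml.m5.xlarge", 1
)
transform_job = build_transform_job_request(
    "mymodel-batch", "mymodel",
    "s3://<input-bucket>/input/", "s3://<output-bucket>/output/",
    "ml.m5.xlarge", 2,
)
```

With a client from `boto3.client("sagemaker")`, these would be sent as `client.create_endpoint_config(**endpoint_config)` followed by `client.create_endpoint(EndpointName=..., EndpointConfigName="mymodel-config")`, or as `client.create_transform_job(**transform_job)` for the batch case.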
Upvotes: 1