Shripad Bharat

Reputation: 41

Create AWS sagemaker endpoint and delete the same using AWS lambda

Is there a way to create a SageMaker endpoint using AWS Lambda?

The maximum timeout for a Lambda function is 300 seconds, while my existing model takes 5-6 minutes to host.

Upvotes: 4

Views: 2242

Answers (2)

raj

Reputation: 1213

One way is to combine Lambda and Step Functions with a wait state to create the SageMaker endpoint.

In the step function, have tasks to:

1. Launch an AWS Lambda function to call CreateEndpoint

import time
import boto3

client = boto3.client('sagemaker')

endpoint_name = 'DEMO-imageclassification-' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
endpoint_config_name = 'DEMO-imageclassification-epc--2018-06-18-17-02-44'
print(endpoint_name)

def lambda_handler(event, context):
    create_endpoint_response = client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=endpoint_config_name)
    print('EndpointArn = {}'.format(create_endpoint_response['EndpointArn']))

    # get the status of the endpoint
    response = client.describe_endpoint(EndpointName=endpoint_name)
    status = response['EndpointStatus']
    print('EndpointStatus = {}'.format(status))
    return status

2. A Wait task to wait for X minutes

3. Another Lambda task that checks the EndpointStatus and, depending on its value (OutOfService | Creating | Updating | RollingBack | InService | Deleting | Failed), either stops the job or continues polling

import boto3

client = boto3.client('sagemaker')

endpoint_name = 'DEMO-imageclassification-2018-07-20-18-52-30'

def lambda_handler(event, context):
    # Check the current status of the endpoint.
    endpoint_response = client.describe_endpoint(EndpointName=endpoint_name)
    status = endpoint_response['EndpointStatus']
    print('EndpointStatus = {}'.format(status))

    # Fail the execution if the endpoint could not be created; otherwise
    # return the status so the state machine can decide whether to keep
    # polling or finish.
    if status == 'Failed':
        raise Exception('Endpoint creation failed.')

    return status
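The three tasks above can be wired together with an Amazon States Language definition along these lines. This is only a sketch: the Lambda ARNs, state names, and wait time are placeholders, not values from the question.

```python
import json

# Sketch of the Step Functions state machine: create the endpoint, wait,
# check the status, and loop back to the wait state until it is InService.
# All ARNs, names, and the wait duration are placeholders.
state_machine = {
    "Comment": "Create a SageMaker endpoint and poll until it is in service",
    "StartAt": "CreateEndpoint",
    "States": {
        "CreateEndpoint": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:create-endpoint",
            "Next": "WaitForEndpoint",
        },
        "WaitForEndpoint": {
            "Type": "Wait",
            "Seconds": 60,
            "Next": "CheckEndpointStatus",
        },
        "CheckEndpointStatus": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:check-endpoint-status",
            "Next": "IsEndpointInService",
        },
        "IsEndpointInService": {
            "Type": "Choice",
            "Choices": [
                # The status-check Lambda returns the EndpointStatus string.
                {"Variable": "$", "StringEquals": "InService", "Next": "Done"}
            ],
            "Default": "WaitForEndpoint",
        },
        "Done": {"Type": "Succeed"},
    },
}

print(json.dumps(state_machine, indent=2))
```

Because the Wait state does the waiting, neither Lambda function ever comes close to the 300-second timeout.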

Another approach is a combination of AWS Lambda functions and CloudWatch rules, which I think would be clumsy.
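The question also asks about deleting the endpoint. Unlike creation, deletion returns quickly, so a single Lambda is enough. A minimal sketch (passing the endpoint name in the event is an assumed convention, not part of the question):

```python
def lambda_handler(event, context):
    # boto3 is imported inside the handler only so this sketch can be
    # loaded without the AWS SDK installed; in a real Lambda it would
    # sit at module level.
    import boto3

    client = boto3.client('sagemaker')

    # 'EndpointName' in the event is a placeholder convention.
    endpoint_name = event['EndpointName']
    client.delete_endpoint(EndpointName=endpoint_name)

    # The endpoint config is a separate resource and can be removed with
    # client.delete_endpoint_config(EndpointConfigName=...) if desired.
    return {'Deleted': endpoint_name}
```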

Upvotes: 2

dennis-w

Reputation: 2156

While raj's answer is closer to what the question asks for, I'd like to add that SageMaker now has batch transform jobs.

Instead of continuously hosting a machine, such a job can predict on a large batch of inputs at once without caring about latency. So if the intention behind the question is to deploy the model for a short time to predict on a fixed amount of data, this might be the better approach.
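A batch transform job is started with the SageMaker CreateTransformJob API. A sketch of the request, where every name, S3 URI, content type, and instance setting is an illustrative placeholder:

```python
# Placeholder request for SageMaker's CreateTransformJob API; all names,
# S3 URIs, and instance settings are illustrative, not from the question.
transform_request = {
    "TransformJobName": "DEMO-imageclassification-batch",
    "ModelName": "DEMO-imageclassification-model",
    "TransformInput": {
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://my-bucket/batch-input/",
            }
        },
        "ContentType": "application/x-image",
    },
    "TransformOutput": {"S3OutputPath": "s3://my-bucket/batch-output/"},
    "TransformResources": {"InstanceType": "ml.m4.xlarge", "InstanceCount": 1},
}

# With a boto3 SageMaker client this would be submitted as:
# client = boto3.client('sagemaker')
# client.create_transform_job(**transform_request)
```

SageMaker spins the instances up for the job, writes the predictions to the output path, and tears everything down when the job finishes, so there is no long-lived endpoint to create or delete.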

Upvotes: 0
