Arturo Sbr

Reputation: 6323

Use AWS Lambda to execute a Jupyter notebook on AWS SageMaker

I made a classifier in Python that uses a lot of libraries. I have uploaded the model to Amazon S3 as a pickle (my_model.pkl). Ideally, every time someone uploads a file to a specific S3 bucket, it should trigger an AWS Lambda function that loads the classifier, returns predictions and saves a few files to an Amazon S3 bucket.

I want to know if it is possible to use a Lambda to execute a Jupyter Notebook in AWS SageMaker. This way I would not have to worry about the dependencies, and the classification would generally be more straightforward.

So, is there a way to use an AWS Lambda to execute a Jupyter Notebook?

Upvotes: 1

Views: 4617

Answers (3)

Dolf Andringa

Reputation: 2170

It is totally possible, and not an anti-pattern at all. It really depends on your use case. AWS actually published a great article describing it, which includes a Lambda

Upvotes: 0

Prakash Gupta

Reputation: 104

I agree with Olivier. Using SageMaker for notebook execution might not be the right tool for the job.

Papermill is a framework for running Jupyter Notebooks in this fashion.

You can consider trying this. It allows you to deploy your Jupyter Notebook directly as a serverless cloud function and uses Papermill behind the scenes.
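For context, a minimal sketch of what calling Papermill directly might look like. The notebook name, the run identifier and the `input_key` parameter are hypothetical placeholders, and the sketch assumes the `papermill` package is installed in whatever environment runs it. Papermill injects the parameters after the notebook cell tagged `parameters`.

```python
from pathlib import PurePosixPath


def executed_path(notebook_path, run_id):
    """Derive the path of the executed copy, e.g. classify.ipynb -> classify-<run_id>.ipynb."""
    p = PurePosixPath(notebook_path)
    return str(p.with_name(f"{p.stem}-{run_id}{p.suffix}"))


def run_notebook(notebook_path, run_id, parameters):
    """Execute a notebook with Papermill and return the path of the executed copy."""
    # Imported lazily so the helper above works even where papermill is not installed.
    import papermill as pm

    output_path = executed_path(notebook_path, run_id)
    # execute_notebook runs every cell and writes the executed notebook to output_path.
    pm.execute_notebook(notebook_path, output_path, parameters=parameters)
    return output_path


# Hypothetical usage:
# run_notebook("classify.ipynb", "run-1", {"input_key": "uploads/data.csv"})
```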

Disclaimer: I work for Clouderizer.

Upvotes: 2

Olivier Cruchant

Reputation: 4037

Scheduling notebook execution is a bit of a SageMaker anti-pattern, because (1) you would need to manage data I/O (training set, trained model) yourself, (2) you would need to manage metadata tracking yourself, (3) you cannot run on distributed hardware and (4) you cannot use Spot. Instead, for scheduled tasks it is recommended to leverage the various SageMaker long-running, background job APIs: SageMaker Training, SageMaker Processing or SageMaker Batch Transform (in the case of batch inference).

That being said, if you still want to schedule a notebook to run, you can do it in a variety of ways:

  • in the SageMaker CI/CD re:Invent 2018 video, notebooks are launched as CloudFormation templates, and their execution is automated via a SageMaker lifecycle configuration.
  • AWS released this blog post to document how to launch notebooks from within Processing jobs.

But again, my recommendation for scheduled tasks would be to remove them from Jupyter, turn them into scripts and run them in SageMaker Training.

No matter your choice, all those tasks can be launched as API calls from within a Lambda function, as long as the function role has the appropriate permissions.
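As an illustration of that last point, here is a rough sketch of a Lambda handler that reacts to an S3 upload and starts a SageMaker Processing job through boto3. The environment variable names, the placeholder defaults, the instance type and the `classify-` job prefix are all assumptions made for the example, and it assumes the container image already bundles the classifier's dependencies and its default entrypoint does the prediction work. The optional `sagemaker_client` argument is only there so the handler can be exercised without AWS access.

```python
import os
import uuid
from urllib.parse import unquote_plus


def build_processing_request(bucket, key, job_name):
    """Build the keyword arguments for SageMaker's create_processing_job call."""
    return {
        "ProcessingJobName": job_name,
        # Placeholder values; a real deployment would set these environment variables.
        "RoleArn": os.environ.get("SAGEMAKER_ROLE_ARN", "placeholder-role-arn"),
        "AppSpecification": {"ImageUri": os.environ.get("IMAGE_URI", "placeholder-image-uri")},
        "ProcessingResources": {
            "ClusterConfig": {
                "InstanceCount": 1,
                "InstanceType": "ml.m5.large",
                "VolumeSizeInGB": 30,
            }
        },
        # The uploaded file is copied into the container before the job starts.
        "ProcessingInputs": [{
            "InputName": "uploaded-file",
            "S3Input": {
                "S3Uri": f"s3://{bucket}/{key}",
                "LocalPath": "/opt/ml/processing/input",
                "S3DataType": "S3Prefix",
                "S3InputMode": "File",
            },
        }],
        # Anything the job writes to LocalPath is uploaded when the job ends.
        "ProcessingOutputConfig": {"Outputs": [{
            "OutputName": "predictions",
            "S3Output": {
                "S3Uri": os.environ.get("OUTPUT_S3_URI", "s3://placeholder-output-bucket/predictions"),
                "LocalPath": "/opt/ml/processing/output",
                "S3UploadMode": "EndOfJob",
            },
        }]},
    }


def lambda_handler(event, context, sagemaker_client=None):
    """Start one Processing job for the object named in an S3 event notification."""
    if sagemaker_client is None:
        import boto3  # available by default in the Lambda Python runtime
        sagemaker_client = boto3.client("sagemaker")

    s3_info = event["Records"][0]["s3"]
    bucket = s3_info["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(s3_info["object"]["key"])

    job_name = f"classify-{uuid.uuid4().hex[:12]}"
    sagemaker_client.create_processing_job(**build_processing_request(bucket, key, job_name))
    return {"job_name": job_name}
```

The call only starts the job and returns immediately, so the Lambda does not wait for predictions. Writing the output to a different bucket or prefix than the one that triggers the function avoids the job's own output re-triggering it.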

Upvotes: 2
