slatermorgan

Reputation: 155

How do you implement rate limiting on a serverless lambda application?

Currently I have a serverless API using lambda and API gateway.

The next feature I want to build is user authentication using AWS Cognito, and then apply rate limiting to each user.

How would I go about doing this? Can API Gateway communicate with Cognito?

I read in the AWS docs:

Per-client throttling limits are applied to clients that use API keys associated with your usage plan as client identifier.

However, as far as I understand, this refers to rate limiting per x-api-key, which is used to invoke the Lambda.

I don't really want to have to create a new one of these keys for every user, as there is a hard limit of 10,000 issued at one time. I would much rather use Cognito user pool keys.

I know an alternative approach would be to build a custom authorizer that writes user IDs to an in-memory database such as Redis (ElastiCache); this would then be queried on every request to work out when that user last made a request.

However, I don't really like this approach: it won't be as scalable as the rest of the serverless API and may become a bottleneck for the whole thing.
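For reference, the approach described above could be sketched roughly like this. All the names (key format, window size, limit) are illustrative, and `store` stands in for a Redis client such as redis-py:

```python
import time

# Fixed-window check a custom authorizer could run per request.
# `store` is anything with get/setex semantics; in production it would
# be a Redis client (e.g. redis-py).

WINDOW_SECONDS = 60   # length of each rate-limit window
MAX_REQUESTS = 100    # allowed requests per user per window

def allow_request(store, user_id, now=None):
    """Return True if the user is still under the per-window limit."""
    now = now if now is not None else time.time()
    window = int(now // WINDOW_SECONDS)
    key = f"rate:{user_id}:{window}"
    count = int(store.get(key) or 0)
    if count >= MAX_REQUESTS:
        return False
    # Store the incremented count; the TTL lets old windows expire.
    store.setex(key, WINDOW_SECONDS, count + 1)
    return True
```

Note the get-then-set here is not atomic and is only illustrative; with real Redis you would use `INCR` plus `EXPIRE` so concurrent requests can't race past the limit.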

How is everyone implementing rate limiting like this? Have I missed something fundamental? Does Amazon have an out-of-the-box solution I can use?

Upvotes: 3

Views: 4114

Answers (3)

Jeff S.

Reputation: 1311

We're currently trying to work around this exact issue, as we're about to hit the 10,000-key hard limit. I'm not sure why this isn't unlimited, to support developer SaaS-type businesses where you might have millions of users. Anyway...

We are using a custom authorizer to verify users and have tried this library: https://github.com/blackflux/lambda-rate-limiter. It seems to work OK and uses in-memory storage to keep track of requests. This does mean the Lambda has to be kept warm, and in theory a user's requests could be spread across multiple Lambda containers (each keeping its own counts), but it could be reliable enough.
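That library is JavaScript, but the in-memory idea it uses can be sketched in Python as well (names and limits below are made up). The key point is that the counters live in module scope, so they survive between invocations of a warm container but are not shared across containers:

```python
import time
from collections import defaultdict

# Per-container, in-memory sliding-window limiter. State lives in
# module scope, so it persists across invocations of the same warm
# Lambda container but NOT across containers -- a user hitting several
# containers gets a fresh allowance in each, as noted above.

_hits = defaultdict(list)  # user_id -> timestamps of recent requests

def check(user_id, limit=5, interval=60.0, now=None):
    """Return True if the user has made fewer than `limit` requests
    in the last `interval` seconds, and record this request."""
    now = now if now is not None else time.time()
    recent = [t for t in _hits[user_id] if now - t < interval]
    if len(recent) >= limit:
        _hits[user_id] = recent
        return False
    recent.append(now)
    _hits[user_id] = recent
    return True
```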

Also had a look at this one: https://github.com/animir/node-rate-limiter-flexible, which has a bunch of storage adapters, but unfortunately not DynamoDB, which we'd prefer. If we outgrow the in-memory solution, the next step could be to write our own DynamoDB adapter or go with Redis.

The disadvantage of rolling your own rate limiting is that, as far as I can tell, you have to turn off custom-authorizer caching, which is a nice feature that stops your authorizer running on every request (so you basically end up running two Lambdas per request). We are weighing up the pros and cons of that against rate limiting.

We are also looking at a WAF from a security perspective but that won't rate limit individual users.

I can't provide a definitive answer but these are the options we are looking at so far.

Upvotes: 0

Daniel Seichter

Reputation: 909

Writing a custom authorizer is one of the most common best practices. It is also not a bottleneck, because the custom authorizer is itself a Lambda, which scales up dynamically.

Iterate on the development of your custom authorizer:

1. Lambda + Cognito

If you really prefer Cognito, then connect your Lambda to Cognito and read the user information from the Lambda's event. Check whether the user exists in Cognito and, optionally, whether the user is a member of a specific group. This can be done with the SDK, e.g. boto3 for Python. To reduce API requests to Cognito, use caching within API Gateway; set it to e.g. 300 seconds, so the Cognito API is only called every five minutes.
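Step 1 might look roughly like this with boto3. The pool ID and group name are placeholders, and the client is injectable here only to keep the sketch self-contained; in a real Lambda you would create it once at module scope:

```python
USER_POOL_ID = "eu-west-1_EXAMPLE"  # placeholder pool id
REQUIRED_GROUP = "api-users"        # placeholder group name

def is_authorized(username, cognito=None):
    """Return True if the user exists in the pool and belongs to
    REQUIRED_GROUP; False otherwise."""
    if cognito is None:
        # Lazily create the real client when none is supplied.
        import boto3
        cognito = boto3.client("cognito-idp")
    try:
        cognito.admin_get_user(UserPoolId=USER_POOL_ID, Username=username)
    except cognito.exceptions.UserNotFoundException:
        return False
    groups = cognito.admin_list_groups_for_user(
        UserPoolId=USER_POOL_ID, Username=username
    )
    return any(g["GroupName"] == REQUIRED_GROUP for g in groups["Groups"])
```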

2. Optimize Performance

If iteration 1 leads you to conclude that this process slows you down, then use the smallest Redis instance and store the information you need in Redis, with an expiry of 30 minutes. Change the custom authorizer to check Redis first: if an unexpired entry exists, use it; if not, call the Cognito API and store the result in Redis as well. You now have a system that calls Redis at most every 5 minutes (because of the caching in API Gateway) and the Cognito API at most every 30 minutes (because of the expiry on the objects in Redis).
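The cache-first lookup in step 2 can be sketched like this. The key format and TTL are assumptions, `redis_client` stands in for a redis-py client, and `fetch_from_cognito` is whatever call retrieves the user record (e.g. the Cognito lookup from step 1):

```python
import json

CACHE_TTL = 1800  # 30 minutes, per the expiry suggested above

def get_user_info(user_id, redis_client, fetch_from_cognito):
    """Return user info from Redis if cached, else fetch it from
    Cognito and cache the result with a TTL."""
    cached = redis_client.get(f"user:{user_id}")
    if cached is not None:
        return json.loads(cached)
    info = fetch_from_cognito(user_id)
    redis_client.setex(f"user:{user_id}", CACHE_TTL, json.dumps(info))
    return info
```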

3. Optimize stability

If all is working, you can work on stability. E.g. check whether Redis is available (redis PING); if not, do not throw errors, just proceed as if you did not have Redis at all. If Cognito is unreachable, extend the expiry of the cached objects so users can still log in while you resolve the Cognito issue.
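The "degrade instead of fail" idea in step 3 is small enough to show directly; `redis_client` again stands in for any client exposing `ping`/`get`:

```python
def safe_get_cached(redis_client, key):
    """Read from the cache, but treat an unreachable Redis as a
    cache miss instead of failing the whole authorizer."""
    try:
        if not redis_client.ping():
            return None
        return redis_client.get(key)
    except Exception:
        # Redis is down or unreachable: proceed as if uncached, so
        # the authorizer falls back to Cognito instead of erroring.
        return None
```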

4. Optional: enrich your data

You can enrich the data with further checks by adding a DynamoDB table and querying it. Storing that information in Redis as well reduces the execution time of the custom authorizer Lambda itself.

Summary

All in all, by using caching in API Gateway, using Redis to reduce requests to the user backend (Cognito), and removing single points of failure among your dependencies (e.g. the Redis PING check), you can have a custom authorizer that decides in under 100 ms whether a user may log in or call an API.

Out of the box, you only have API keys for "userless" authorization. The 10,000 limit can be increased, but you would also have to implement a service that attaches a key to each user. This would be possible using Cognito User Pool triggers (when a user is created, create a new API key and send it to the user via SES), but you would have to write further triggers/functions for use cases like removing a user, etc.

Upvotes: 1

Vikas Bansal

Reputation: 121

AWS API Gateway is better suited to the client-credentials OAuth flow for point-to-point connectivity; it doesn't provide features such as per-user rate limiting. You can use a Lambda authorizer with DynamoDB to store each user's limit and current count, and enforce rate limiting per user that way. There is no built-in API Gateway feature for user-based limiting.
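The DynamoDB variant of this could be sketched as a fixed-window counter updated with a conditional write, so concurrent requests can't exceed the limit. The table name, key schema, window, and limit are all assumptions; `table` stands in for a boto3 DynamoDB `Table` resource with partition key `pk`:

```python
import time

WINDOW = 60   # seconds per rate-limit window
LIMIT = 100   # allowed requests per user per window

def allow(table, user_id, now=None):
    """Atomically increment the user's counter for the current window,
    failing the condition (and denying the request) once LIMIT is hit."""
    now = now if now is not None else time.time()
    pk = f"{user_id}#{int(now // WINDOW)}"
    try:
        table.update_item(
            Key={"pk": pk},
            UpdateExpression="ADD #c :one",
            # Only increment while the counter is absent or under LIMIT.
            ConditionExpression="attribute_not_exists(#c) OR #c < :limit",
            ExpressionAttributeNames={"#c": "count"},
            ExpressionAttributeValues={":one": 1, ":limit": LIMIT},
        )
        return True
    except table.meta.client.exceptions.ConditionalCheckFailedException:
        return False
```

Old windows can be cleaned up by putting the window timestamp in a DynamoDB TTL attribute, so expired counters are deleted automatically.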

Upvotes: 2
