sat

Reputation: 11

Connecting Cassandra from AWS Lambda

We are checking the feasibility of migrating one of our applications to Amazon Web Services (AWS). We decided to use AWS API Gateway to expose the services and AWS Lambda (Java) for back-end data processing. The Lambda function has to fetch a large amount of data from our database. We currently use Cassandra for data storage; it is set up on an EC2 instance that has no public IP.

Can anyone suggest a way to access Cassandra (on EC2) from AWS Lambda using the private IP (10.0.x.x)?

Is AWS Lambda the right choice for large-scale applications?

Upvotes: 0

Views: 2167

Answers (1)

Christophe Schmitz

Reputation: 2996

Since your Cassandra instance uses a private IP, you will need to configure your Lambda function to run inside a VPC. It can be the VPC that Cassandra runs in, or a separate VPC that you create for your Lambda functions and peer with the Cassandra VPC. A few things to note from the documentation:

  • When your Lambda runs in a VPC, it has no internet access by default; you will need to configure a NAT for that.
  • Setting up the ENI adds latency (you only pay that penalty on a cold start).
  • Make sure your Lambda has permission to manage the ENI; attach the AWSLambdaVPCAccessExecutionRole managed policy to its execution role.

Your plan to use API Gateway with AWS Lambda has at least three potential issues that you need to consider carefully:

  • Cost. The API Gateway per-request cost is higher than the AWS Lambda per-request cost. Make sure you are familiar with the pricing of both.
  • Cold start. When AWS starts an underlying container to execute your Lambda, you pay a cold start latency (which gets worse in a VPC because of the ENI setup). If your Lambda executes concurrently, there will be multiple underlying containers, and each of them pays this cold start on its first invocation. AWS tends to keep the underlying containers ready for a warm start for a few minutes (users report 5 to 40 minutes). You can try to keep a container warm by pinging your Lambda, but this gets tricky when multiple containers run in parallel.
  • Cassandra session. You will probably want to avoid creating and destroying your Cassandra session on every invocation, because that is costly. I haven't tried it yet, but there are reports of keeping the session alive in a warm container; you might want to check this SO answer.
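To illustrate the session point, here is a minimal sketch of caching one expensive object per container in a static field, so that only the first (cold) invocation pays the creation cost. It is self-contained and does not call a real Cassandra driver: `WarmResource`, `SessionReuseSketch`, `handleRequest`, and the placeholder string standing in for a session are all hypothetical names chosen for illustration. In a real function the supplier would build the session with whatever driver you use, and whether the session actually survives depends on AWS keeping the container warm.

```java
import java.util.function.Supplier;

/** Lazily creates one instance and reuses it for the lifetime of the JVM (the warm container). */
final class WarmResource<T> {
    private final Supplier<T> factory;
    private volatile T instance;
    private int creations;

    WarmResource(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the cached instance, creating it on first use (double-checked locking). */
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = factory.get();
                    creations++;
                    instance = local;
                }
            }
        }
        return local;
    }

    /** Number of times the factory actually ran; stays at 1 while the container is warm. */
    synchronized int creations() {
        return creations;
    }
}

class SessionReuseSketch {
    // Static state outlives a single invocation as long as the same container is reused.
    // ASSUMPTION: the string below is a stand-in for a real Cassandra session object.
    static final WarmResource<String> SESSION =
            new WarmResource<>(() -> "placeholder-cassandra-session");

    /** Hypothetical handler body: reuses the cached session instead of reconnecting. */
    static String handleRequest(String query) {
        String session = SESSION.get();
        return session + " executed: " + query;
    }

    public static void main(String[] args) {
        handleRequest("first query");   // cold: the factory runs here
        handleRequest("second query");  // warm: the cached instance is reused
        System.out.println("sessions created: " + SESSION.creations());
    }
}
```

Each concurrent container has its own JVM, so each one would still create its own session once; the static field only removes the per-invocation cost within a container.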

Having said all that, the biggest limitations of AWS Lambda are currently concurrent execution and cold start latency. For data processing, that's usually fine. For user-facing workloads, the share of requests that hit a slow cold start might affect the user experience.

Upvotes: 3
