Reputation: 165
I am trying to set up real-time data ingestion into ClickHouse hosted on EC2. My pipeline is:
EventBridge -> Kinesis Firehose (destination: HTTP endpoint) -> Lambda (function URL) -> ClickHouse HTTP endpoint.
Everything works as expected in UAT. But the function URL provided by Lambda is public by default, which is of course a security concern in prod. Is there any way to secure the hop from Firehose to the Lambda HTTP endpoint, so that the endpoint can only be invoked by Firehose and the data never leaves the AWS account?
P.S.: If there is any way to improve this pipeline, please post it in the comments too. That would be helpful.
Upvotes: 0
Views: 935
Reputation: 3439
There is a trick using the "Transform source records with AWS Lambda" configuration on the Firehose stream.
Eventbridge -> KinesisFirehose(any destination e.g. S3) -> drop
                    |
                    +-> lambda("Transform source") -> anything you want
This way Firehose invokes the Lambda function through a proper IAM role instead of a public URL, and the Lambda still receives all the data from Firehose. Your Lambda should return Dropped for all records, see https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html
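A minimal sketch of such a transformation Lambda, in Python. The ClickHouse address, the table name `events`, and the helper `send_to_clickhouse` are hypothetical placeholders, not part of the original setup, and the sketch assumes each Firehose record carries one JSON object:

```python
import base64
import json
import urllib.parse
import urllib.request

# Hypothetical values: replace with your own ClickHouse host and table.
CLICKHOUSE_URL = "http://10.0.0.10:8123/"
INSERT_QUERY = "INSERT INTO events FORMAT JSONEachRow"


def send_to_clickhouse(rows):
    """POST newline-delimited JSON rows to the ClickHouse HTTP interface."""
    body = "\n".join(json.dumps(row) for row in rows).encode("utf-8")
    url = CLICKHOUSE_URL + "?" + urllib.parse.urlencode({"query": INSERT_QUERY})
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()


def handler(event, context, sink=send_to_clickhouse):
    """Firehose transformation handler: forward every record, then drop it."""
    # Firehose delivers each record's payload base64-encoded in "data".
    rows = [json.loads(base64.b64decode(rec["data"])) for rec in event["records"]]
    if rows:
        sink(rows)
    # Mark every record as Dropped so nothing reaches the Firehose destination.
    return {
        "records": [
            {"recordId": rec["recordId"], "result": "Dropped"}
            for rec in event["records"]
        ]
    }
```

The `sink` parameter only exists so the handler can be exercised without a ClickHouse instance. Lambda itself calls `handler(event, context)` and the default sink is used.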
Upvotes: 0
Reputation: 2060
Based on the question's comments, this answer is mostly about adding authentication to that Lambda URL...
I don't think the Lambda URL will work when called from Firehose. The reason is that if you're using IAM authorization (implied by the security requirement), calling it requires the client to sign the API request, and Firehose doesn't support that.
I'm not sure why Firehose is in the pipeline, but I think you can remove it and then either call the Lambda directly from EventBridge, or put API Gateway in between EventBridge and the Lambda.
Calling the Lambda directly might be simpler, but then you lose the flexibility of having a web API. Security is easy, though: it's handled by IAM.
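For the direct option, access is usually granted with a resource-based permission on the function that allows one specific EventBridge rule to invoke it. A rough sketch in Python that builds the arguments for boto3's Lambda `add_permission` call; the helper name, the statement ID, and the function name and rule ARN in the usage comment are hypothetical:

```python
def eventbridge_invoke_permission(function_name, rule_arn):
    """Build keyword arguments for boto3's Lambda add_permission call."""
    return {
        "FunctionName": function_name,
        "StatementId": "allow-eventbridge-rule",  # hypothetical statement ID
        "Action": "lambda:InvokeFunction",
        "Principal": "events.amazonaws.com",
        "SourceArn": rule_arn,  # restrict invocation to this one rule
    }


# Usage (needs boto3 and AWS credentials, so it is left commented out):
# import boto3
# boto3.client("lambda").add_permission(
#     **eventbridge_invoke_permission("clickhouse-ingest", rule_arn)
# )
```

With this in place the function needs no function URL at all, so there is no public endpoint to protect.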
API Gateway shouldn't be much more difficult, and I assume your Lambda already handles the payload (since that's what the Lambda function URL sends). That looks like this:
EventBridge -> API Gateway -> Lambda
The API Gateway would need either IAM or Cognito authorization; with Cognito, that means the client_credential flow. In EventBridge, you'd set up your target as an "EventBridge API destination" and use an authorization type of "OAuth Client Credentials".
You also mention the ClickHouse API. I imagine looking into that would be even simpler, depending on how much logic you have in the Lambda. It looks like ClickHouse has an HTTP interface, so you'd just need to use the "EventBridge API destination" and send to that. Your EC2 hosts would either need to be publicly accessible, or you might be able to proxy the request through API Gateway or something else.
Upvotes: 1