user21185672

AWS SQS to S3 using Lambda

Our upstream system sends JSON messages to our SQS queue, and we expect around 5 million messages per day. I need to persist these messages to an S3 bucket for archiving and analytics purposes. The plan is to dequeue the messages with a Lambda function and write batches of 100K messages to a single S3 file, so we end up with multiple reasonably small files in the bucket to facilitate quick processing. The Lambda would be triggered a few times a day. Any sample code for the Lambda function that I can use, or any pointers, would be appreciated.
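This is roughly what I have in mind, in case it helps frame the question. It is an untested sketch: the queue URL, bucket name, and key layout are placeholders, and I realize that pulling 100K messages (10 per receive_message call) may not fit within a single 15-minute Lambda invocation.

```python
import boto3
from datetime import datetime, timezone

sqs = boto3.client("sqs")
s3 = boto3.client("s3")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/incoming-events"  # placeholder
BUCKET = "my-archive-bucket"  # placeholder
BATCH_TARGET = 100_000  # messages to pack into one S3 object


def lambda_handler(event, context):
    bodies, receipts = [], []
    while len(bodies) < BATCH_TARGET:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=10,  # SQS maximum per call
            WaitTimeSeconds=1,
        )
        msgs = resp.get("Messages", [])
        if not msgs:
            break  # queue drained
        for m in msgs:
            bodies.append(m["Body"])
            receipts.append({"Id": m["MessageId"], "ReceiptHandle": m["ReceiptHandle"]})

    if not bodies:
        return {"written": 0}

    # One JSON message per line (JSON Lines) so analytics tools can read it;
    # date-partitioned key layout for easier downstream filtering.
    key = datetime.now(timezone.utc).strftime("archive/%Y/%m/%d/%H%M%S.jsonl")
    s3.put_object(Bucket=BUCKET, Key=key, Body="\n".join(bodies).encode("utf-8"))

    # Delete only after the S3 write succeeded; delete_message_batch takes <= 10 entries.
    for i in range(0, len(receipts), 10):
        sqs.delete_message_batch(QueueUrl=QUEUE_URL, Entries=receipts[i : i + 10])

    return {"written": len(bodies), "key": key}
```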

Upvotes: 0

Views: 879

Answers (1)

John Rotenstein

Reputation: 269081

Storing and processing millions of small objects in Amazon S3 is not advisable.

Any software or service that attempts to use those objects will be very slow. For example, simply listing the contents of an Amazon S3 bucket returns at most 1,000 objects per API call, so enumerating 5 million objects takes roughly 5,000 sequential calls. Even services such as Amazon Athena, which read multiple files in parallel, will be slow to list and read that many objects.
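As a rough illustration of that listing cost (the bucket name is a placeholder), counting objects with boto3 makes the number of round trips explicit:

```python
import boto3

s3 = boto3.client("s3")

# list_objects_v2 returns at most 1,000 keys per page, so a bucket holding
# 5 million objects needs ~5,000 sequential API calls just to enumerate.
paginator = s3.get_paginator("list_objects_v2")
total_objects = 0
for page in paginator.paginate(Bucket="my-archive-bucket"):  # placeholder name
    total_objects += len(page.get("Contents", []))

print(f"{total_objects} objects listed")
```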

An alternative approach is to send the messages to an Amazon Kinesis Data Firehose delivery stream, which buffers incoming messages by size or elapsed time and writes each buffered batch to Amazon S3 as a single object, greatly reducing the number of objects created in the bucket.
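One way to wire that up is a small Lambda function subscribed to the queue that simply forwards each message to the delivery stream. This is a minimal sketch, assuming an SQS event source mapping and a delivery stream named archive-stream (both names are placeholders):

```python
import boto3

firehose = boto3.client("firehose")
STREAM_NAME = "archive-stream"  # placeholder delivery stream name


def lambda_handler(event, context):
    # Forward each SQS message to Firehose; Firehose buffers by size/time and
    # writes the combined batches to S3. A trailing newline per record keeps
    # the resulting S3 objects line-delimited for analytics tools.
    records = [{"Data": (r["body"] + "\n").encode("utf-8")} for r in event["Records"]]

    # put_record_batch accepts at most 500 records per call.
    for i in range(0, len(records), 500):
        resp = firehose.put_record_batch(
            DeliveryStreamName=STREAM_NAME,
            Records=records[i : i + 500],
        )
        if resp.get("FailedPutCount"):
            # Raising makes SQS redeliver the batch rather than lose messages.
            raise RuntimeError(f"{resp['FailedPutCount']} records failed to queue")
```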

If you are dealing with 100K+ objects in Amazon S3, also consider using Amazon S3 Inventory, which can provide a daily or weekly CSV file listing all objects.
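If it helps, an inventory report can be enabled with a one-time configuration call; this is a sketch only, with placeholder bucket names and account ID:

```python
import boto3

s3 = boto3.client("s3")

# Ask S3 to emit a daily CSV listing of every object in the bucket.
s3.put_bucket_inventory_configuration(
    Bucket="my-archive-bucket",  # placeholder: bucket being inventoried
    Id="daily-inventory",
    InventoryConfiguration={
        "Id": "daily-inventory",
        "IsEnabled": True,
        "IncludedObjectVersions": "Current",
        "Schedule": {"Frequency": "Daily"},
        "Destination": {
            "S3BucketDestination": {
                "AccountId": "123456789012",  # placeholder
                "Bucket": "arn:aws:s3:::my-inventory-reports",  # placeholder
                "Format": "CSV",
                "Prefix": "inventory",
            }
        },
        "OptionalFields": ["Size", "LastModifiedDate"],
    },
)
```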

Upvotes: 3
