Visuddha Karunaratne

Reputation: 536

Buffer Incoming Data and put into S3

For a specific business requirement, I need to batch up 5 minutes' worth of data from AWS IoT into S3 and then process that data.

  1. I tried the Firehose approach, where I put data into a Firehose delivery stream and buffer it for 5 minutes (up to 900 seconds is possible). However, this only works for a limited volume of incoming records: once the size threshold (128 MB) is reached, Firehose writes the data to S3 without waiting for the full 5 minutes. Hence this is not scalable.

What are other ways to achieve this in AWS?

I'd appreciate your input.

Upvotes: 1

Views: 1235

Answers (1)

John Rotenstein

Reputation: 270154

Amazon Kinesis Data Firehose is convenient because it can accept an incoming stream of data and save it to Amazon S3. You are correct that the maximum buffer is 900 seconds and 128MB.

See: Amazon Kinesis Data Firehose Limits

It sounds like you are not happy with such limits and you would like a single file after 5 minutes, regardless of file size. To accomplish this, you would need to use a normal Amazon Kinesis stream with your own consumer reading data from the stream. This is a fairly complex process and involves having Amazon EC2 instance(s) reading the data and copying it to S3.
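As a rough illustration of the batching part of such a consumer, here is a minimal sketch in Python. It only shows the time-window logic; the class name `WindowBuffer` and the `sink` callback are hypothetical, and reading from the stream and writing to S3 are left out. In a real consumer, `sink` would presumably be a function that uploads the payload to S3 (for example via boto3's `put_object`).

```python
import time
from typing import Callable, List, Optional


class WindowBuffer:
    """Collects records and hands them to `sink` as one batch per time window.

    `sink` is a hypothetical callback (e.g. something that uploads to S3).
    `clock` is injectable so the logic can be exercised without waiting.
    """

    def __init__(self, window_seconds: float, sink: Callable[[bytes], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window_seconds = window_seconds
        self.sink = sink
        self.clock = clock
        self.records: List[bytes] = []
        self.window_start: Optional[float] = None

    def add(self, record: bytes) -> None:
        now = self.clock()
        if self.window_start is None:
            # First record of a new window starts the timer.
            self.window_start = now
        elif now - self.window_start >= self.window_seconds:
            # Window has elapsed: emit the old batch, then start a new window.
            self.flush()
            self.window_start = now
        self.records.append(record)

    def flush(self) -> None:
        # Emit whatever is buffered as one newline-delimited payload.
        if self.records:
            self.sink(b"\n".join(self.records) + b"\n")
            self.records = []
        self.window_start = None
```

Note that this sketch only flushes when a new record arrives after the window has elapsed, so a real consumer would also want to call `flush()` on a timer and on shutdown. It also keeps the whole window in memory, which may not be acceptable at high volume.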

It would be much easier to use Amazon Kinesis Data Firehose. Perhaps one option is to have Firehose output files at its normal limit intervals, but use something else to trigger the processing (or whatever you wish to do) every 5 minutes.
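One way to picture that second option: a job that runs every 5 minutes (how it is scheduled is up to you) and picks up whichever files Firehose delivered during the previous 5-minute window. The sketch below shows only the window selection. The function names are hypothetical, and it assumes each object is described by a dict with `Key` and a timezone-aware `LastModified`, which is the shape of the `Contents` entries returned by an S3 object listing.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Tuple

WINDOW = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def previous_window(now: datetime) -> Tuple[datetime, datetime]:
    """Return [start, end) of the most recent complete 5-minute window.

    `now` must be timezone-aware. Windows are aligned to the clock
    (12:00, 12:05, 12:10, ...).
    """
    end = EPOCH + ((now - EPOCH) // WINDOW) * WINDOW
    return end - WINDOW, end


def keys_in_window(objects: List[Dict], start: datetime,
                   end: datetime) -> List[str]:
    """Select keys of objects whose LastModified falls in [start, end)."""
    return [o["Key"] for o in objects if start <= o["LastModified"] < end]
```

Keep in mind that `LastModified` reflects when Firehose wrote the file, not when the records arrived, so a file delivered just after a window boundary can contain records from the previous window.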

Upvotes: 2
