Reputation: 1841
JSON files are posted daily to an S3 bucket. I want to take that JSON file, do some processing on it, then post the data to a new S3 bucket where it will get picked up and stored in Redshift. What would be the recommended AWS pipeline for this? An AWS Lambda that triggers when a new JSON file lands in S3 and then kicks off something like an AWS Batch job? Or something else? I am not familiar with all the AWS services, so I might be overlooking something obvious.
So the flow looks like this:
s3 bucket -> data processing -> s3 bucket -> redshift
and it's the data processing step I'm not sure about: how to schedule something fairly scalable that runs daily, processes efficiently, and puts the data back. The processing is parsing the JSON data plus some aggregation and data clean-up.
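For the kind of processing described (parse, aggregate, clean up), a single Lambda function triggered by the source bucket is often enough. Below is a minimal sketch; the bucket name, the event wiring, and the `transform` logic are all assumptions for illustration, not a definitive design. The boto3 import is kept inside the handler so the pure transform can be exercised locally without AWS dependencies.

```python
import json
from collections import defaultdict

DEST_BUCKET = "processed-data-bucket"  # hypothetical destination bucket


def transform(records):
    """Placeholder clean-up/aggregation: drop records missing an 'id',
    then sum 'amount' per 'category'. Swap in the real logic here."""
    totals = defaultdict(float)
    for rec in records:
        if "id" not in rec:
            continue  # clean-up step: discard malformed records
        totals[rec.get("category", "unknown")] += rec.get("amount", 0)
    return [{"category": c, "total": t} for c, t in sorted(totals.items())]


def handler(event, context):
    # Assumes an s3:ObjectCreated:* trigger on the source bucket.
    import boto3  # imported here so transform() is testable without boto3

    s3 = boto3.client("s3")
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        result = transform(json.loads(body))
        # Write newline-delimited JSON, a format Redshift's COPY can load.
        out = "\n".join(json.dumps(r) for r in result)
        s3.put_object(Bucket=DEST_BUCKET, Key=f"processed/{key}",
                      Body=out.encode("utf-8"))
```

From the destination bucket, a Redshift `COPY ... FORMAT AS JSON` statement (run on a schedule or via another trigger) completes the last hop of the pipeline.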
Upvotes: 0
Views: 204
Reputation: 65594
and it's the data processing step I'm not sure about - how to schedule something fairly scalable that runs daily and efficiently and puts the data back.
Don't worry about scalability with Lambda; just focus on short-running jobs. Here is an example: https://docs.aws.amazon.com/lambda/latest/dg/with-scheduledevents-example.html
I think one piece of the puzzle you're missing is the documentation for Schedule Expressions Using Rate or Cron: https://docs.aws.amazon.com/lambda/latest/dg/with-scheduledevents-example.html
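To make the two forms concrete, here is a small sketch of what the rate and cron variants look like, with a toy validator; the exact field grammar below is a simplified assumption based on the linked docs, not the full specification.

```python
import re

# Simplified shapes of the two schedule-expression forms (assumed from
# the docs): rate(value unit) or cron(six space-separated fields).
RATE_RE = re.compile(r"^rate\(\d+ (minute|minutes|hour|hours|day|days)\)$")
CRON_RE = re.compile(r"^cron\(([^ ]+ ){5}[^ ]+\)$")


def looks_like_schedule_expression(expr):
    """Rough sanity check that a string resembles a schedule expression."""
    return bool(RATE_RE.match(expr) or CRON_RE.match(expr))


# Run the processing job once a day:
DAILY_RATE = "rate(1 day)"
# Or at 06:00 UTC every day
# (cron fields: minute hour day-of-month month day-of-week year):
DAILY_CRON = "cron(0 6 * * ? *)"
```

Either string would be attached to a scheduled rule that invokes the Lambda daily.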
Upvotes: 2