Reputation: 125
I have multiple CSV files containing data for different tables, with file sizes ranging from 1 MB to 1.5 GB. I want to process the data row by row (replace/remove values in some columns) and then load it into existing Redshift tables. This is a once-a-day batch process.
python, psycopg lib, etc.), which leads to more cost. I need inputs on which service is best suited and most cost-optimized for this use case. It would also be great if anyone could suggest a better way to use the services I mentioned above.
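For reference, this is the kind of per-row transform and load step I have in mind, as a minimal Python sketch using psycopg2 and boto3. The `transform_row` logic, bucket, table, IAM role, and connection parameters are all placeholders for my actual setup:

```python
import csv
import boto3
import psycopg2

def transform_row(row):
    # Placeholder per-row cleanup: replace/remove column values as needed.
    row["status"] = row.get("status", "").strip() or "unknown"
    return row

def process_file(in_path, out_path):
    # Stream the CSV row by row so even the 1.5 GB files fit in memory.
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            writer.writerow(transform_row(row))

def load_to_redshift(out_path, bucket, key, table, iam_role, conn_params):
    # Stage the transformed file in S3, then COPY it into the existing table.
    boto3.client("s3").upload_file(out_path, bucket, key)
    with psycopg2.connect(**conn_params) as conn, conn.cursor() as cur:
        cur.execute(
            f"COPY {table} FROM 's3://{bucket}/{key}' "
            f"IAM_ROLE '{iam_role}' CSV IGNOREHEADER 1;"
        )
        conn.commit()
```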
Upvotes: 0
Views: 1662
Reputation: 379
Batch is best suited for your use case. I see that your concern about Batch is the development and unit testing on your personal desktop. You can automate that process using AWS ECR, CodePipeline, CodeCommit and CodeBuild. Set up a pipeline to detect changes pushed to your code repo, build the image and push it to ECR. Batch can pick up the latest image from there. A sketch of the final step is below.
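To illustrate that last step, here is a small sketch of submitting the containerized job to AWS Batch with boto3, assuming a job definition that points at the image CodeBuild pushed to ECR. The queue name, job definition name, region, and environment variable are hypothetical; a daily EventBridge schedule (or similar trigger) would invoke something like this:

```python
import boto3

batch = boto3.client("batch", region_name="us-east-1")

def submit_daily_job(csv_prefix):
    # Submit one job per daily run. Passing only the job definition name
    # (no revision) makes Batch use the latest active revision, i.e. the
    # one referencing the most recently built image in ECR.
    response = batch.submit_job(
        jobName="csv-to-redshift-daily",
        jobQueue="etl-job-queue",            # hypothetical queue name
        jobDefinition="csv-to-redshift",     # hypothetical job definition
        containerOverrides={
            "environment": [
                {"name": "CSV_PREFIX", "value": csv_prefix},
            ]
        },
    )
    return response["jobId"]

if __name__ == "__main__":
    print(submit_daily_job("s3://my-bucket/incoming/2024-01-01/"))
```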
Upvotes: 0