Reputation: 861
I want to process a large csv (millions of lines) with a Java application on AWS, and write the results in another csv.
The application is packaged in a single jar and can be run with a shell command like `java -jar myJar.jar -option1 -option2`.
The application can be triggered at any time, whenever a user uploads a csv.
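For reference, whatever AWS wrapper ends up driving the jar only needs to spawn that same shell command. A minimal sketch using `ProcessBuilder`, assuming the jar path and option names from above (both are placeholders):

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JarRunner {

    // Launch the packaged jar exactly as from the shell and return its exit code.
    // A launch failure (e.g. jar or JVM not found) is reported as -1.
    public static int runJar(String jarPath, String... options) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(jarPath);
        cmd.addAll(Arrays.asList(options));
        try {
            Process p = new ProcessBuilder(cmd)
                    .inheritIO()   // forward the jar's stdout/stderr to ours
                    .start();
            return p.waitFor();
        } catch (IOException | InterruptedException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        int exit = runJar("myJar.jar", "-option1", "-option2");
        System.out.println("jar exited with code " + exit);
    }
}
```

The same wrapper works whether it runs on an EC2 worker, in a container, or inside a Lambda handler, which is what keeps the jar itself unchanged.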
Problem: it works for small files, but Lambda functions are limited in execution time, RAM, CPU, and temporary file storage; they are really made for short-lived processes.
Problem: keeping a cluster running, even when idle, means paying for it.
Is there a way to run this jar without having to recode its equivalent in a custom AWS technology?
EDIT: To answer the comments
Upvotes: 1
Views: 491
Reputation: 5220
There are multiple ways you can make this work more efficiently and save money.
Require coding:
Less coding required:
Note that the biggest Lambda is quite powerful at the moment: 3000 MB of RAM with proportional CPU power, and up to 15 minutes to do a task. By comparison, keeping one t2.medium (4 GB RAM, 2 vCPUs) running 24/7 costs roughly $38 a month.
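To put that trade-off in numbers, here is a rough back-of-envelope comparison. The Lambda GB-second rate and the $38/month EC2 figure below are assumptions for illustration (prices vary by region and over time; check the current AWS price list):

```java
public class CostComparison {

    // Assumed rates, for illustration only:
    static final double LAMBDA_PER_GB_SECOND = 0.0000166667; // $/GB-second
    static final double EC2_T2_MEDIUM_MONTHLY = 38.0;        // $/month, 24/7

    // Compute cost of a single Lambda run at a given memory size and duration.
    static double lambdaRunCost(double memoryGb, double seconds) {
        return memoryGb * seconds * LAMBDA_PER_GB_SECOND;
    }

    public static void main(String[] args) {
        // A maxed-out run: ~3 GB of RAM for the full 15-minute (900 s) cap.
        double perRun = lambdaRunCost(3.0, 900);
        double breakEvenRuns = EC2_T2_MEDIUM_MONTHLY / perRun;
        System.out.printf("Cost per max-size Lambda run: $%.4f%n", perRun);
        System.out.printf("Break-even runs/month vs t2.medium: %.0f%n", breakEvenRuns);
    }
}
```

Under these assumed rates a worst-case run costs a few cents, so Lambda only stops being the cheaper option once you need many hundreds of such runs per month (or your jobs outgrow the 15-minute cap).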
Or Both:
Upvotes: 1