Reputation: 21596
I am using Spring Batch to download a big file in order to process it. The scenario is pretty simple:
1. Download the file via HTTP
2. Process it (validations, transformations)
3. Send it to a queue
I am looking for the best practice for handling this situation.
Should I create a Tasklet to download the file locally and then start processing it via regular steps?
In that case I need to deal with some temp-file concerns
(make sure I delete it, make sure I am not overwriting another temp file with the same name, etc.).
On the other hand, I could download it and keep it in memory, but then I am afraid that if I run many job instances I'll run out of memory very soon.
How would you suggest handling this scenario? Should I use a Tasklet at all?
Thank you.
Upvotes: 1
Views: 5157
Reputation: 21463
If you have a large file, I'd recommend storing it to disk unless there is a good reason not to. Saving the file to disk lets you restart the job without re-downloading the file if an error occurs.
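As a rough illustration of the download-to-disk part, here is a minimal sketch using only the JDK. The class and method names (`TempFileDownload`, `saveToTempFile`, `download`) are hypothetical, not part of Spring Batch; the idea is that a Tasklet's `execute` method could call something like `download(...)` and pass the resulting path to later steps. It streams the content to disk rather than buffering it in memory, and lets the JDK pick a unique temp file name so concurrent job instances don't overwrite each other.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; names are illustrative and not part of Spring Batch.
final class TempFileDownload {

    // Streams the input to a uniquely named temp file and returns its path.
    static Path saveToTempFile(InputStream in, String prefix) throws IOException {
        // createTempFile generates a unique name, so parallel jobs do not collide.
        Path target = Files.createTempFile(prefix, ".tmp");
        try {
            // The file already exists (empty), hence REPLACE_EXISTING.
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            return target;
        } catch (IOException e) {
            // Do not leave a partial file behind if the copy fails.
            Files.deleteIfExists(target);
            throw e;
        }
    }

    // Opens the URL and streams its content to a temp file.
    static Path download(URI source, String prefix) throws IOException {
        try (InputStream in = source.toURL().openStream()) {
            return saveToTempFile(in, prefix);
        }
    }
}
```

Deleting the file is deliberately left to the caller: if the file should survive a failed run so that a restart can reuse it, cleanup would presumably happen only after the job completes successfully.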
With regards to the Tasklet vs Spring Integration, we typically recommend Spring Integration for this type of functionality since FTP functionality is already available there. That being said, Spring XD uses a Tasklet for FTP functionality so it's not uncommon to take that approach as well.
A good video to watch about the integration of Spring Batch and Spring Integration is the talk Gunnar Hillert and I gave at SpringOne2GX. You can find the entire video here: https://www.youtube.com/watch?v=8tiqeV07XlI. The section that talks about using Spring Batch Integration for FTP before Spring Batch is at about 29:37.
Upvotes: 3
Reputation: 2154
I believe the example below is a classic solution to your problem: http://docs.spring.io/spring-batch/trunk/reference/html/springBatchIntegration.html#launching-batch-jobs-through-messages
Upvotes: 1