Shriram
Shriram

Reputation: 103

Multithreaded batch processing to write to and read from database

I am supposed to design a component which should achieve the following tasks by using multiThreading in Java as the files are huge/multiple and the task has to happen in a very short window:

  1. Read multiple csv/xml files and save all the data in database
  2. Read the database and write the data in separate files csv & xmls as per the txn types. (Each file may contain different types of records life file header, batch header, batchfooter, file footer, different transactions, and checksum record)

I am very new to multithreading & doing some research on Spring Batch in order to use it for the above tasks.

Please let me know what you suggest to use traditional multithreading in Java or Spring Batch. The input sources are multiple here and output sources are also multiple.

Upvotes: 3

Views: 6967

Answers (2)

gkamal
gkamal

Reputation: 21000

Spring Batch is ideal to implement your requirement. First of all you can use the builtin readers and writers to simplify your implementation - there is very good support for parsing CSV files, XML files, reading from database via JDBC etc. You also get the benefit of features like retrying in case of failure, skipping input that is invalid, restarting the whole job if something fails in between - the framework will track the status & restart from where it left off. Implementing all this by yourself is very complex and doing it well requires a lot of effort.

Once you implement your batch jobs with spring batch it gives you simple ways of parallelizing it. A single step can be run in multiple threads - it is mostly a configuration change. If you have multiple steps to be performed you can configure that as well. There is also support for distributing the processing over multiple machines if required. Most of the work to achieve parallelism is done by Spring Batch.

I would strongly suggest that you prototype a couple of your most complex scenarios with Spring Batch. If that works out you can go ahead with Spring Batch. Implementing it on your own especially when you are new to multi threading is a sure recipe for disaster.

Upvotes: 3

toske
toske

Reputation: 1754

I would recommend going with something from framework rather than writing whole threading part yourself. I've quite successfully used Sping's tasks and scheduling for scheduled tasks that involved reaching data from DB, doing some processing, and sending emails, writing data back to database).

Upvotes: 4

Related Questions