Reputation: 567
Context of My question:
I use a proprietary Database (target database) and I can not reveal the name of the DB (you may not know even If I reveal the name).
Here, I usually need to update the records using java. (The number of records vary from 20000 to 40000) Each update transaction is taking one or two seconds for this DB. So, you see that the execution time would be in hours. There are no Batch execution functions are available for this Database API. For this, I am thinking to use Java multi-threaded feature, instead of executing all the records in single process I want to create a thread for every 100 records. We know that Java can make these threads run parallelly.
But, I want to know how does the DB process these threads sharing the same connection? I can find this by running a trail program and compare time intervals. I feel that it may be deceiving to some extent. I know that you don't have much information about the database. You can just answer this question assuming the DB as MS SQL/MySQL.
Please suggest me if there is any other feature in java I can utilize to make this program execute faster if not multi-threading.
Upvotes: 0
Views: 1005
Reputation: 159
It is not recommended to use single connection with multiple threads, you can read the pitfalls of doing so here.
If you really need to use a single connection with multiple threads, then I would suggest making sure threads start and stop successfully within a transaction. If one of them fails you have to make sure to rollback the changes. So, first get the count, make cursor ranges and for each range start a thread that will execute that on that range. One thing to look for is to not close the connection after executing the partitions individually, but to close it when the transaction is complete and the db is committed.
If you have an option to use Spring Framework, check out Spring Batch.
Spring Batch provides reusable functions that are essential in processing large volumes of records, including logging/tracing, transaction management, job processing statistics, job restart, skip, and resource management. It also provides more advanced technical services and features that will enable extremely high-volume and high performance batch jobs through optimization and partitioning techniques. Simple as well as complex, high-volume batch jobs can leverage the framework in a highly scalable manner to process significant volumes of information.
Hope this helps.
Upvotes: 1