Reputation: 6550
I have a MySQL database with a large number of rows.
I want to initialize multiple Threads (each with its own database connection) in Java and read/print the data simultaneously.
How can I partition the data between multiple threads so that no two threads read the same record? What strategies can be used?
Upvotes: 2
Views: 2117
Reputation: 17923
If the large dataset has an integer primary key, then one of the approaches would be as follows
Note: the following issues with this approach
This approach is simple and makes sure that each row is processed by exactly one thread.
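One common way to split on an integer primary key is to give each thread its own contiguous id range. The following is only a minimal sketch of that idea, assuming the minimum and maximum ids are already known (for example from `SELECT MIN(id), MAX(id)`); the class name `KeyRangePartitioner`, the table name `my_table` and the column name `id` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class KeyRangePartitioner {

    /**
     * Splits the inclusive id interval [minId, maxId] into at most
     * `threads` contiguous, non-overlapping ranges. Each element of the
     * result is a two-element array {from, to}, both inclusive.
     */
    public static List<long[]> split(long minId, long maxId, int threads) {
        List<long[]> ranges = new ArrayList<>();
        long total = maxId - minId + 1;
        long base = total / threads;   // ids every range gets
        long extra = total % threads;  // the first `extra` ranges get one more
        long start = minId;
        for (int i = 0; i < threads; i++) {
            long size = base + (i < extra ? 1 : 0);
            if (size == 0) {
                break; // more threads than ids: nothing left to hand out
            }
            ranges.add(new long[] {start, start + size - 1});
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Each thread would run one of these queries on its own connection.
        for (long[] r : split(1, 10, 3)) {
            System.out.println(
                "SELECT * FROM my_table WHERE id BETWEEN " + r[0] + " AND " + r[1]);
        }
    }
}
```

Because the ranges never overlap, no row can be read twice. If the ids have large gaps, some threads may end up with far fewer rows than others.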
Upvotes: 1
Reputation: 32537
It depends on what kind of work your threads are going to do. For example, I usually execute a single SELECT for the large dataset, add tasks to a thread-safe task queue, and submit workers that pick tasks up from the queue and process them. I usually write to the DB without synchronisation, but that depends on the size of the unit of work and on DB constraints (like unique keys etc.). Works like a charm. Another method would be to simply run multiple threads and let them work on their own. I strongly advise against using some fancy LIMIT/OFFSET scheme, however: it still requires the DB to fetch more rows than the query actually returns.
EDIT: Since you have added a comment saying that you have the same data, then yes, my solution is what you are looking for.
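A rough sketch of the single-reader, many-workers idea follows. It is only an illustration under assumptions: the rows are stood in for by a plain `List<Integer>` rather than a real JDBC `ResultSet`, the executor's internal queue plays the role of the thread-safe task queue, and the class and method names (`QueueWorkers`, `processAll`) are made up for the example.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class QueueWorkers {

    /**
     * One producer (the calling thread) hands each row to a fixed pool of
     * workers. Returns the set of rows that were processed.
     */
    public static Set<Integer> processAll(List<Integer> rows, int workers) {
        Set<Integer> processed = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        // In a real program this loop would iterate over the ResultSet
        // of the single SELECT instead of an in-memory list.
        for (Integer row : rows) {
            pool.submit(() -> {
                // Placeholder for the real per-row work (print, transform, write).
                processed.add(row);
            });
        }

        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return processed;
    }

    public static void main(String[] args) {
        Set<Integer> done = processAll(List.of(1, 2, 3, 4, 5), 3);
        System.out.println("processed " + done.size() + " rows");
    }
}
```

Each row is submitted exactly once, so no two workers receive the same record, and only one connection is needed for reading.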
Upvotes: 2
Reputation: 10099
You can use a singleton class to keep track of the rows that have already been read, so every thread can get the next row number from that singleton.
Alternatively, you can use a static AtomicInteger variable in a common class. Each thread calls getAndIncrement to claim the next row number, which partitions the data between the threads.
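As a minimal sketch of the AtomicInteger variant: every thread claims the next row index with `getAndIncrement` until the indexes run out. The counter here is a local variable shared by the workers rather than a static field, purely to keep the example self-contained; the class name `SharedCounter`, the method `claimAll` and the use of a zero-based row index are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

public class SharedCounter {

    /**
     * Starts `threads` workers that claim row indexes 0..totalRows-1 from a
     * shared counter. Returns, for each index, how many times it was claimed.
     */
    public static AtomicIntegerArray claimAll(int totalRows, int threads) {
        AtomicInteger next = new AtomicInteger(0);
        AtomicIntegerArray claims = new AtomicIntegerArray(totalRows);

        Runnable worker = () -> {
            int row;
            // getAndIncrement is atomic, so each index goes to one thread only.
            while ((row = next.getAndIncrement()) < totalRows) {
                // Placeholder for reading and processing the row at this index.
                claims.incrementAndGet(row);
            }
        };

        List<Thread> started = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(worker);
            t.start();
            started.add(t);
        }
        for (Thread t : started) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return claims;
    }

    public static void main(String[] args) {
        AtomicIntegerArray claims = claimAll(1000, 4);
        System.out.println("row 0 claimed " + claims.get(0) + " time(s)");
    }
}
```

Note that fetching one row per claimed index means one query per row; claiming a block of indexes at a time (for example with `getAndAdd`) would reduce the number of round trips.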
Upvotes: 0