Arvind

Reputation: 6474

Java: how to determine the optimum number of threads for a specific type of processing on different types of servers

I have a Java program that visits some websites, converts each site's HTML into XML, runs some XQuery commands on the XML, and finally stores the result as CSV, which is then uploaded to cloud file storage (like Amazon S3).

Now I want to split the work across multiple threads so that it finishes faster, but how do I determine the number of threads that is optimal for my work?

I want to determine how many threads I should allow for the different types of Amazon EC2 instances. Is there a library or framework that can help me with this?

Or do I have to manually run the code on an Amazon EC2 instance, keep changing the number of threads, and measure the time taken?

Specifically, I want to strike a balance between the total time taken to process all the work and the number of threads that are allowed to run simultaneously. If I could clearly see this correlation for different servers with different CPU/RAM capacities, that would be great. Any advice or guidance would be appreciated.

Upvotes: 2

Views: 4065

Answers (3)

Garrett Hall

Reputation: 30022

To find the number of logical cores available you can use:

int processors = Runtime.getRuntime().availableProcessors();

and create a thread pool with that many threads. See also:

Finding Number of Cores in Java

Java: How to scale threads according to cpu cores?
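The pool sizing described in this answer can be sketched as follows. This is a minimal illustration, not a tuned setup: the class name `CorePoolExample`, the helper `newCoreSizedPool`, and the eight placeholder tasks are assumptions made for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class CorePoolExample {

    // Hypothetical helper: builds a fixed pool with one thread per logical
    // core reported by the JVM.
    static ExecutorService newCoreSizedPool() {
        int processors = Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(processors);
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = newCoreSizedPool();

        // Placeholder tasks; real work (download, convert, upload) would go here.
        for (int i = 0; i < 8; i++) {
            final int id = i;
            pool.submit(() ->
                System.out.println("task " + id + " on " + Thread.currentThread().getName()));
        }

        // Stop accepting new tasks and wait for the submitted ones to finish.
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

One thread per logical core is a reasonable starting point for CPU-bound work; for work that mostly waits on the network, a larger pool is likely needed.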

Upvotes: 1

Sean Owen

Reputation: 66876

The type of work you describe is almost certainly I/O bound -- most of the time is spent waiting for data to be downloaded or uploaded. If so, your goal is simply to make full use of upload / download bandwidth.

In that case, the optimal number of threads will be more than the number of physical cores on the machine (the core count would be the right place to start for a CPU-bound process).

It's hard to say from this info what the optimum number of threads will be as it depends on how much you're downloading and how fast the link is. Try doubling the number of threads until performance starts to suffer.
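The "keep doubling until it gets worse" approach can be sketched roughly like this. Everything named here is an assumption for illustration: `findBestThreadCount` is a hypothetical helper, and `timeSimulatedBatch` fakes I/O waits with `Thread.sleep` instead of doing real downloads, so the numbers it produces say nothing about a real EC2 instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntToLongFunction;

public class ThreadCountSearch {

    // Doubles the thread count starting from 1 (up to maxThreads) and returns
    // the last count whose measured time was still an improvement.
    static int findBestThreadCount(IntToLongFunction timeForThreads, int maxThreads) {
        int best = 1;
        long bestTime = timeForThreads.applyAsLong(1);
        for (int n = 2; n <= maxThreads; n *= 2) {
            long t = timeForThreads.applyAsLong(n);
            if (t >= bestTime) {
                break; // no longer improving, stop doubling
            }
            best = n;
            bestTime = t;
        }
        return best;
    }

    // Hypothetical stand-in workload: 64 tasks that each "wait on I/O" for 20 ms.
    // Returns the elapsed wall-clock time in nanoseconds.
    static long timeSimulatedBatch(int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < 64; i++) {
                futures.add(pool.submit(() -> {
                    try {
                        Thread.sleep(20);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every task to finish
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int best = findBestThreadCount(ThreadCountSearch::timeSimulatedBatch, 64);
        System.out.println("Best thread count for the simulated workload: " + best);
    }
}
```

To use this against the real job, the timing function would run a representative batch of actual downloads and uploads on the target machine. A single measurement per thread count is noisy, so repeating each run a few times is probably worthwhile.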

Upvotes: 4

Nishant

Reputation: 55856

I think you should profile your app with a single thread using JHAT, MAT, etc., and then decide how many threads to run based on the machine's configuration. Profiling will give you a general idea of how expensive each thread is. You can then run a load test (like 10,000 items queued up against 10 threads) to validate the limits that you came up with, and tune accordingly.
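A load test along those lines might look like the sketch below. The class `LoadTestSketch`, the helper `runLoadTest`, the ten-minute timeout, and the 1 ms sleep standing in for one item of work are all assumptions for illustration; the real per-item work would replace the placeholder.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LoadTestSketch {

    // Queues `items` tasks on a fixed pool of `threads` workers and returns
    // how many of them completed before the pool terminated.
    static int runLoadTest(int items, int threads, Runnable work) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < items; i++) {
            pool.submit(() -> {
                work.run();
                completed.incrementAndGet();
            });
        }
        pool.shutdown(); // no new tasks; queued ones still run
        try {
            if (!pool.awaitTermination(10, TimeUnit.MINUTES)) {
                pool.shutdownNow(); // give up on whatever is still queued
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        // Placeholder for one item of real work (fetch, convert, upload).
        Runnable placeholder = () -> {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        long start = System.nanoTime();
        int done = runLoadTest(10_000, 10, placeholder);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " items completed in " + elapsedMs + " ms");
    }
}
```

Running it with different `threads` values while watching CPU, memory, and network usage gives the data needed to tune the limit.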

Upvotes: 2
