Reputation: 56894
I think I "get" the basics of multi-threading with Java. If I'm not mistaken, you take some big job and figure out how you are going to chunk it up into multiple (concurrent) tasks. Then you implement those tasks as either Runnables or Callables and submit them all to an ExecutorService. (So, to begin with, if I am mistaken on this much, please start by correcting me!!!)
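For reference, the workflow described above might look roughly like the sketch below. The class name ChunkedSum, the method sumTo, and the toy job (summing 1..n) are made up for illustration; only Callable, ExecutorService, Executors and Future are the real java.util.concurrent types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedSum {

    // Sums 1..n by splitting the range into independent Callable tasks.
    static long sumTo(int n, int chunks) {
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            int chunkSize = (n + chunks - 1) / chunks; // round up so no value is skipped
            for (int start = 1; start <= n; start += chunkSize) {
                final int from = start;
                final int to = Math.min(n, start + chunkSize - 1);
                tasks.add(() -> {
                    long partial = 0;
                    for (int i = from; i <= to; i++) {
                        partial += i;
                    }
                    return partial;
                });
            }
            long total = 0;
            // invokeAll blocks until every submitted task has completed.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumTo(1_000_000, 4)); // prints 500000500000
    }
}
```

Each task here touches only its own local variables, so no locking is needed; the partial results are combined on the calling thread.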
Second, I have to imagine that the code you implement inside run() or call() has to be as "parallelized" as possible, using non-blocking algorithms, etc. And that this is where the hard part is (writing parallel code). Correct? Not correct?
But the real problem I'm still having with Java concurrency (and I guess concurrency in general), and which is the true subject of this question, is:
When is it even appropriate to multi-thread in the first place?
I saw an example from another question on Stack Overflow where the poster proposed creating multiple threads for reading and processing a huge text file (the book Moby Dick), and one answerer commented that multi-threading for the purpose of reading from disk was a terrible idea. Their reasoning was that you'd have multiple threads introducing the overhead of context switching on top of an already slow process (disk access).
So that got me thinking: what classes of problems are appropriate for multi-threading, what classes of problems should always be serialized? Thanks in advance!
Upvotes: 10
Views: 1181
Reputation: 62439
So that got me thinking: what classes of problems are appropriate for multi-threading, what classes of problems should always be serialized?
Basically, CPU-intensive tasks (ones that do a lot of data processing, such as in-memory sorting) should be parallelized where possible, and I/O-bound tasks (such as disk I/O) should be left sequential. This is general advice, with some exceptions of course.
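As a rough illustration of the in-memory sorting case, the sketch below sorts the same data once with Arrays.sort and once with Arrays.parallelSort (both are standard JDK methods since Java 8). The class name, array size and seed are arbitrary choices, and the timing is a crude one-shot measurement, not a proper benchmark, so treat the numbers as indicative only.

```java
import java.util.Arrays;
import java.util.Random;

class SortComparison {

    // Builds a reproducible array of pseudo-random ints.
    static int[] randomData(int size, long seed) {
        Random random = new Random(seed);
        int[] data = new int[size];
        for (int i = 0; i < size; i++) {
            data[i] = random.nextInt();
        }
        return data;
    }

    public static void main(String[] args) {
        int[] sequential = randomData(5_000_000, 42L);
        int[] parallel = sequential.clone();

        long t0 = System.nanoTime();
        Arrays.sort(sequential);       // single-threaded sort
        long t1 = System.nanoTime();
        Arrays.parallelSort(parallel); // may split the work across the common fork/join pool
        long t2 = System.nanoTime();

        System.out.printf("Arrays.sort: %d ms, Arrays.parallelSort: %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
        System.out.println("same result: " + Arrays.equals(sequential, parallel));
    }
}
```

On a single-core machine or with a small array, the parallel version may not be faster at all.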
Upvotes: 5
Reputation: 24847
99.9% of threads on your box are not doing any CPU-intensive work. My box here has 1084 threads at the moment and 1% CPU use; the 1084 threads are not doing anything significant at all. They are all waiting, many on signals from other threads but, most importantly, many on I/O.
The most important and pervasive reason for using multiple threads on a preemptive multitasking OS is to boost overall I/O performance for an app. These preemptive kernels force us into the pain of synchronization, queues, locks etc., essentially a different design zone where one instruction no longer necessarily follows another. The upside, and it's a huge one, is that I/O performance is massively better than with any cooperative scheduling system, since any thread waiting on a driver for I/O can be made ready/running 'immediately' on I/O completion by the driver, which responds to a hardware interrupt.
Async I/O does not change this; it just moves the I/O wait to a kernel thread pool that must queue up async requests and make the user thread that set up the callback ready when its I/O completes (while forcing user code to revert to explicit state machines).
So 'what classes of problems are appropriate for multi-threading':
1) Anywhere that I/O is expected from multiple sources where completion can occur asynchronously.
2) Where threads make app design easier, quicker and safer. If 20 'things' have to happen concurrently, it's much easier to write apparently 'in-line' code and run it with 20 threads than to develop a state machine yourself to handle 20 different contexts. Since threads inside a process share memory, it's trivial to communicate huge buffers (OK, buffer references/pointers) on queues, simplifying layered/pipelined apps, e.g. comms stacks.
3) CPU-intensive operations on multicore boxes, especially where the datasets for each thread/core can be isolated for caching optimization.
4) AOB :)
Without multiple threads and the I/O performance from the preemptive kernel, there would be no BitTorrent, no video streaming, no MMP games, no AVI player.
You would, however, be able to run Notepad and MS Word...
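To make point 2 a bit more concrete, here is a minimal sketch of passing buffer references between two threads over a queue. It assumes a single producer and a single consumer; the class name BufferPipeline, the END marker and the queue capacity of 8 are illustrative choices, while BlockingQueue and ArrayBlockingQueue are the standard java.util.concurrent types.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BufferPipeline {

    // End-of-stream marker, recognised by reference comparison.
    private static final byte[] END = new byte[0];

    // Runs one producer and one consumer; returns the number of bytes the consumer saw.
    static long run(int bufferCount, int bufferSize) {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(8);
        long[] total = new long[1]; // written only by the consumer thread

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < bufferCount; i++) {
                    queue.put(new byte[bufferSize]); // only the reference is enqueued, nothing is copied
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (byte[] buffer = queue.take(); buffer != END; buffer = queue.take()) {
                    total[0] += buffer.length; // stand-in for real processing
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join(); // after join, the consumer's writes are visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the pipeline", e);
        }
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(run(100, 1024)); // prints 102400
    }
}
```

Both threads read like straight-line code; the blocking put and take calls do the waiting that a hand-written state machine would otherwise have to track.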
Upvotes: 0
Reputation: 86381
Multithreading is valuable to:
Upvotes: 3
Reputation: 717
Concurrency is also very useful in certain algorithms. For instance, I'm currently writing a program which will calculate the optimal solution to a complicated problem using a genetic algorithm. In a genetic algorithm, you have a population of individuals who all must execute a fitness function. The executions of these fitness tests are generally going to be completely independent of each other, and there are going to be a lot of them (you could have population sizes in the hundreds, for instance). Parallelization can dramatically increase the speed of a genetic algorithm by cutting down the time it takes to execute all of the fitness functions.
Hopefully this gives you an idea of what people are referring to when they talk about 'CPU-intensive' tasks, especially because not all CPU-intensive tasks are easily made parallel.
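A minimal sketch of that idea, assuming each individual is encoded as a long and using a placeholder fitness function (counting set bits); neither is taken from the actual program, and the class and method names are invented for illustration. Because each evaluation is independent, a parallel stream is one simple way to spread them over the available cores.

```java
import java.util.Arrays;

class FitnessEvaluation {

    // Placeholder fitness function (hypothetical): number of 1 bits in the genome.
    static int fitness(long genome) {
        return Long.bitCount(genome);
    }

    // Evaluates every individual; result i is the fitness of population[i].
    static int[] evaluate(long[] population) {
        return Arrays.stream(population)
                .parallel()                           // evaluations share no state, so they can run concurrently
                .mapToInt(FitnessEvaluation::fitness)
                .toArray();                           // encounter order is preserved
    }

    public static void main(String[] args) {
        long[] population = {0L, 1L, 3L, -1L};
        System.out.println(Arrays.toString(evaluate(population))); // prints [0, 1, 2, 64]
    }
}
```

With a fitness function this cheap the parallel overhead would dominate; the approach pays off when each evaluation is expensive.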
Upvotes: 0
Reputation: 691655
Multi-threading has two main advantages, IMO:
Note: the problem with reading from the same disk from multiple threads is that instead of reading the whole long file sequentially, it would force the disk to switch between various physical locations of the disk at each context switch. Since all the threads are waiting for the disk-reading to finish (they're IO-bound), this makes the reading slower than if a single thread read everything. But once the data is in memory, it would make sense to split the work between threads.
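The note above could be sketched as follows: one thread reads the file sequentially, and only the in-memory processing is split across threads. The class name, the word-count task and the command-line argument are assumptions made for the example; Files.readAllLines and parallelStream are standard JDK calls.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

class ReadThenProcess {

    // Step 1: a single thread reads the whole file sequentially.
    static List<String> readAll(Path file) {
        try {
            return Files.readAllLines(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Step 2: once the data is in memory, the CPU-bound part is spread across cores.
    static long countWords(List<String> lines) {
        return lines.parallelStream()
                .mapToLong(line -> {
                    String trimmed = line.trim();
                    return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
                })
                .sum();
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java ReadThenProcess <text-file>");
            return;
        }
        System.out.println(countWords(readAll(Paths.get(args[0]))));
    }
}
```

This assumes the file fits in memory; for a larger file, a single reader thread could hand chunks to worker threads instead.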
Upvotes: 8
Reputation: 33534
I prefer it this way....
Threading is very important in GUI-based applications.
In Java, the GUI is handled by the Event Dispatch Thread. It's always advisable to keep UI work on the UI thread and non-UI work on a non-UI thread. Suppose you press a button that sends an HTTP request to some web server; processing takes place on the server, and then it responds with the result. If you don't create a non-UI thread to handle this job, your GUI will be unresponsive until the web server's response is received.
Threads are also very important where multiple jobs have to be done simultaneously. The best example is an OS. I normally code while listening to my favourite music and surfing the net at the same time. This is where multi-threading is very handy; with only one thread, we could never have imagined doing what we do with an OS today.
Multiple threads spread across CPUs are used for parallel processing of CPU-intensive work.
In the case of Java servlets, every request hitting the server is handled by a separate thread provided by the container.
Upvotes: 1
Reputation: 51445
So that got me thinking: what classes of problems are appropriate for multi-threading, what classes of problems should always be serialized?
When you're constructing a GUI using Swing components, sometimes the tasks you want to do by clicking on a button (as an example) take so long that you would lock the GUI while you're performing the task.
So, you perform the task in a different thread (for example with a SwingWorker), so you can keep the GUI thread (the Event Dispatch Thread) responsive to the Swing components.
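A minimal sketch of that pattern using javax.swing.SwingWorker: the slow work runs in doInBackground on a worker thread, and the label is updated in done, which Swing calls on the Event Dispatch Thread. The class name, the component layout and slowTask (a sleep standing in for something like a network call) are made up for the example.

```java
import java.awt.BorderLayout;
import java.awt.GraphicsEnvironment;
import java.util.concurrent.ExecutionException;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.SwingUtilities;
import javax.swing.SwingWorker;

class ResponsiveGui {

    // Hypothetical slow job; the sleep stands in for real work.
    static String slowTask(long millis) {
        try {
            Thread.sleep(millis);
            return "done";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    static void createAndShow() {
        JFrame frame = new JFrame("SwingWorker sketch");
        JLabel status = new JLabel("idle");
        JButton button = new JButton("Start");

        button.addActionListener(event -> {
            status.setText("working...");
            new SwingWorker<String, Void>() {
                @Override
                protected String doInBackground() {
                    return slowTask(2000); // runs on a worker thread, not the EDT
                }

                @Override
                protected void done() {    // runs on the EDT, so touching the label is safe
                    try {
                        status.setText(get());
                    } catch (InterruptedException | ExecutionException e) {
                        status.setText("failed: " + e.getMessage());
                    }
                }
            }.execute();
        });

        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.add(button, BorderLayout.NORTH);
        frame.add(status, BorderLayout.SOUTH);
        frame.pack();
        frame.setVisible(true);
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available; skipping the GUI demo.");
            return;
        }
        SwingUtilities.invokeLater(ResponsiveGui::createAndShow);
    }
}
```

While the worker is sleeping, the window can still be moved, resized and repainted, which would not be the case if slowTask were called directly inside the action listener.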
Upvotes: 3