Reputation: 2833
For what reasons would one choose several processes over several threads to implement an application in Java?
I'm refactoring an older Java application which is currently divided into several smaller applications (processes) running on the same multi-core machine and communicating with each other via sockets.
I personally think this should be done using threads rather than processes, but what arguments would defend the original design?
Upvotes: 8
Views: 3523
Reputation: 49230
I (and others, see attributions below) can think of a couple of reasons:
Historical Reasons
Robustness and Fault Tolerance
You use components which are not thread safe, so you cannot parallelize without resorting to multiple processes.
Some components are buggy and you don't want them to be able to affect more than one process. Say, if a component has a memory or resource leak which eventually could force a process restart, then only the process using the component is affected.
Correct multithreading is still hard to do. Depending on your design, it can be harder than multiprocessing. The latter, however, is arguably not easy either.
You can have a model where a watchdog process actively monitors worker processes and restarts them when they crash. This may also include suspend/resume of processes, which is not safe with threads (thanks to @Jayan for pointing this out).
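As a rough illustration of the watchdog idea, here is a minimal sketch using `ProcessBuilder`. The class name `Watchdog`, the method `supervise` and the "restart on non-zero exit code" policy are my own assumptions for the example, not something taken from the original application. The worker's output is simply discarded to keep the sketch short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

// Hypothetical watchdog sketch: names and restart policy are illustrative assumptions.
final class Watchdog {

    /**
     * Starts the given command as a separate OS process and restarts it
     * whenever it exits with a non-zero code, at most maxRestarts times.
     * Returns how many times the worker was started in total.
     */
    static int supervise(List<String> command, int maxRestarts) {
        int starts = 0;
        try {
            while (true) {
                Process worker = new ProcessBuilder(command)
                        .redirectErrorStream(true)                       // merge stderr into stdout
                        .redirectOutput(ProcessBuilder.Redirect.DISCARD) // drop worker output in this sketch
                        .start();
                starts++;
                int exitCode = worker.waitFor(); // blocks until the worker process ends
                if (exitCode == 0 || starts > maxRestarts) {
                    return starts; // clean exit, or restart budget used up
                }
                // Non-zero exit: treat it as a crash and loop to start a fresh process.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
    }
}
```

A call such as `Watchdog.supervise(List.of("java", "-jar", "worker.jar"), 5)` (the jar name is made up) would keep one worker alive across up to five crashes. A crashing thread inside a single JVM offers no equally clean restart boundary.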
OS Resource Limits & Governance
If a process, even with a single thread, is already using all of its available address space (e.g. 2 GB for 32-bit apps on Windows), you might need to distribute work amongst several processes.
Limiting the use of resources (CPU, memory, etc.) is typically only possible on a per-process basis (for example, on Windows you could create "job" objects, which require a separate process).
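To make the per-process limit point concrete for Java: the simplest limit I can think of is capping the heap of each child JVM with the standard `-Xmx` option, which you cannot do per thread. This is a sketch only. The class `LimitedWorker`, its methods and the worker class name `com.example.Worker` are hypothetical, and this is a different mechanism from Windows job objects (which Java has no built-in API for).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: launches a worker in its own JVM with its own heap cap.
final class LimitedWorker {

    /** Builds the command line for a child JVM whose heap is capped at maxHeap (e.g. "64m"). */
    static List<String> command(String maxHeap, String mainClass) {
        // Reuse the java binary and classpath of the JVM we are running in.
        String javaBin = Path.of(System.getProperty("java.home"), "bin", "java").toString();
        return List.of(
                javaBin,
                "-Xmx" + maxHeap,                        // heap limit applies to this process only
                "-cp", System.getProperty("java.class.path"),
                mainClass);
    }

    /** Starts the child JVM; its output goes to the parent's console. */
    static Process start(String maxHeap, String mainClass) {
        try {
            return new ProcessBuilder(command(maxHeap, mainClass)).inheritIO().start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this, `LimitedWorker.start("64m", "com.example.Worker")` gives that worker at most 64 MB of heap. If it runs out of memory, it fails alone and the other processes keep their own budgets.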
Security Considerations
Compatibility Issues
Location Transparency
Upvotes: 10
Reputation: 17444
If you decide to go with threads, you restrict your app to running on a single machine. That solution scales only up to a point, because there are always hardware limits.
Separate processes communicating via sockets, on the other hand, can be distributed across machines, so you can add a virtually unlimited number of them. This scales better, at the cost of slower communication between processes.
Deciding which approach is more suitable is itself a very interesting task. And once you make the decision, there's no guarantee that it won't look stupid to your successors in a couple of years, when requirements change or new hardware becomes available.
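Here is a small sketch of why sockets give you that freedom: the client only knows a host name and a port, so moving the server to another machine is a configuration change. The class `SocketPeer`, the one-line "echo" protocol and both method names are assumptions made up for this example. The server handles a single connection to keep things short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical example: one-line request/reply over TCP.
final class SocketPeer {

    /** Starts a server on a free port that answers one request with "echo: <line>". */
    static ServerSocket startEchoServer() {
        try {
            ServerSocket server = new ServerSocket(0); // port 0 = let the OS pick a free port
            Thread handler = new Thread(() -> {
                try (Socket client = server.accept();
                     BufferedReader in = new BufferedReader(new InputStreamReader(
                             client.getInputStream(), StandardCharsets.UTF_8));
                     PrintWriter out = new PrintWriter(new OutputStreamWriter(
                             client.getOutputStream(), StandardCharsets.UTF_8), true)) {
                    out.println("echo: " + in.readLine());
                } catch (IOException e) {
                    // Server socket closed or client went away; ignored in this sketch.
                }
            });
            handler.setDaemon(true);
            handler.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sends one line to host:port and returns the one-line reply. */
    static String request(String host, int port, String message) {
        try (Socket socket = new Socket(host, port);
             PrintWriter out = new PrintWriter(new OutputStreamWriter(
                     socket.getOutputStream(), StandardCharsets.UTF_8), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(
                     socket.getInputStream(), StandardCharsets.UTF_8))) {
            out.println(message); // auto-flush sends the line immediately
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The same `request` call works whether `host` is `127.0.0.1` or a machine on the other side of the network. Only the latency changes, which is exactly the trade-off mentioned above.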
Upvotes: 7