Reputation: 1654
The reason I need this is that I have to poll data from 1000 devices, each with a GPRS modem attached, and the client protocol implementation for these devices has a blocking API, e.g.:
<data> = protocol.get_<some_data>(stream)
All get methods are blocking: they either return data or raise an exception. The stream is a TCP socket connection established from the GPRS modem to our app. The protocol is implemented in Python. Its complexity is hard to describe: there are about 100 different device types with specific features, and the get methods are aware of them, so the implementation is far too complex to port to, say, Go or Erlang (I would ask for so much money for this that my boss would cry). So the question boils down to how to maintain 1000 threads in Python. I know that this number is far beyond what Python can handle, not only because of the GIL (I use CPython at the moment) but also because the OS will feel that a third world war has begun (I planned to run all of this on one server machine).
Upvotes: 1
Views: 648
Reputation: 73304
Assuming that you absolutely must use only blocking I/O (e.g. because you have an existing codebase that would be too expensive to rewrite to use non-blocking I/O), the easiest thing to do would be to simply spawn 1000 threads. Most OSes can handle that many threads (albeit not necessarily all that efficiently), and the GIL will not be a problem because a thread that is blocked waiting for I/O does not hold the GIL. (The GIL is a problem only when you are trying to get a speedup by parallelizing CPU-bound computations, and it sounds like all of your threads will be I/O-bound.)
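To make the thread-per-device idea concrete, here is a minimal sketch. The names `poll_device` and `poll_all` and the `streams` dictionary are made up for illustration, and `get_data` stands in for one of your blocking `protocol.get_<some_data>` calls. The optional `stack_size` argument is an assumption on my part: a smaller per-thread stack can reduce memory use with many threads, but whether a given size is safe depends on your platform and on how deep your protocol code recurses.

```python
import threading


def poll_device(device_id, stream, get_data, results):
    """Run one blocking get call and record its result or its exception."""
    try:
        results[device_id] = ("ok", get_data(stream))
    except Exception as exc:  # the get methods either return data or raise
        results[device_id] = ("error", exc)


def poll_all(streams, get_data, stack_size=None):
    """Poll every stream in its own thread.

    streams maps a device id to its stream (hypothetical layout).
    Returns {device_id: ("ok", data)} or {device_id: ("error", exception)}.
    """
    if stack_size is not None:
        # Applies to threads created after this call; returns the old value.
        previous = threading.stack_size(stack_size)
    results = {}
    threads = [
        threading.Thread(target=poll_device,
                         args=(device_id, stream, get_data, results))
        for device_id, stream in streams.items()
    ]
    try:
        for thread in threads:
            thread.start()
    finally:
        if stack_size is not None:
            threading.stack_size(previous)  # restore the previous setting
    for thread in threads:
        thread.join()  # each thread blocks in I/O without holding the GIL
    return results
```

Each thread writes to its own key in `results`, so no extra locking is needed for this simple collection step.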
If you find that 1000 threads in a single process is in fact too many for your OS of choice to handle acceptably, you could always break up the threads into multiple processes (e.g. 10 processes with 100 threads each, or whatever other ratio you find works best). If the problem then turns out to be a global thread limit (e.g. 1000 threads is too many, regardless of how many processes you spread them out over), the next thing you could do is spread them across multiple computers (e.g. 10 computers running 100 threads each).
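A rough sketch of the processes-times-threads split could look like the following. Everything here is hypothetical: `blocking_get` is a stand-in for a real blocking protocol call, and devices are identified by plain integers. In a real deployment each process would also have to own the sockets of its devices (for example by accepting those connections itself), which this sketch does not attempt to show.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def blocking_get(device_id):
    """Hypothetical stand-in for protocol.get_<some_data>(stream)."""
    time.sleep(0.01)  # pretend to wait on the network
    return device_id * 2


def split_into_groups(items, n_groups):
    """Deal items round-robin into n_groups lists of near-equal size."""
    return [items[i::n_groups] for i in range(n_groups)]


def poll_group(device_ids):
    """Runs inside one worker process: one thread per device in the group."""
    if not device_ids:
        return {}
    with ThreadPoolExecutor(max_workers=len(device_ids)) as pool:
        return dict(zip(device_ids, pool.map(blocking_get, device_ids)))


def poll_in_processes(device_ids, n_procs=10):
    """Spread the devices over n_procs processes and merge their results."""
    results = {}
    with ProcessPoolExecutor(max_workers=n_procs) as procs:
        for part in procs.map(poll_group, split_into_groups(device_ids, n_procs)):
            results.update(part)
    return results
```

With 1000 device ids and `n_procs=10`, each worker process ends up with 100 threads. Call `poll_in_processes` from under an `if __name__ == "__main__":` guard so that it also works on platforms where worker processes are spawned rather than forked.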
These are all kind of ugly solutions, though; the real solution would be to rewrite the program to use non-blocking I/O so that each thread could handle a (potentially large) number of sockets concurrently. If you haven't already read it, you might want to read the C10K problem article on the subject of supporting many concurrent TCP connections well.
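For comparison, this is roughly what the non-blocking approach looks like with the standard `selectors` module: one thread waits on many sockets at once and reads from whichever is ready. The function name `read_from_many` and the fixed `expected_bytes` message length are simplifying assumptions for the sketch; a real protocol would need its own per-device parsing state.

```python
import selectors


def read_from_many(socks, expected_bytes, timeout=5.0):
    """Collect expected_bytes from each socket using a single thread.

    socks maps a name to a connected socket (hypothetical layout).
    Returns {name: bytes received so far}.
    """
    sel = selectors.DefaultSelector()
    buffers = {}
    for name, sock in socks.items():
        sock.setblocking(False)
        buffers[name] = b""
        sel.register(sock, selectors.EVENT_READ, data=name)
    pending = len(socks)
    try:
        while pending:
            events = sel.select(timeout)
            if not events:
                break  # nothing became readable in time; return partial data
            for key, _mask in events:
                try:
                    chunk = key.fileobj.recv(4096)
                except BlockingIOError:
                    continue  # spurious wakeup; wait for the next event
                buffers[key.data] += chunk
                # An empty chunk means the peer closed the connection.
                if not chunk or len(buffers[key.data]) >= expected_bytes:
                    sel.unregister(key.fileobj)
                    pending -= 1
    finally:
        sel.close()
    return buffers
```

The catch is the one already mentioned: the existing blocking get methods cannot be dropped into a loop like this as they are, which is why this route means rewriting the protocol code.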
Upvotes: 1