dmoon1221

Reputation: 297

Can two threads interact simultaneously with two different sockets on the same port?

Suppose I have a multithreaded server writing data to two different sockets on the same port, where a dedicated thread handles each socket. Is it possible for the two threads to write simultaneously to their respective sockets? (By "simultaneously", I mean true simultaneity, not just concurrent interleaving.) Or does the fact that the sockets share the same port mean mutual exclusion is enforced?

In general, I'm unclear about how resources are shared between two arbitrary I/O streams. I understand that two threads cannot write simultaneously to disk because the disk itself is a shared resource. In the case of sockets and ports, however, I don't have a similar physical model to guide my reasoning. A related question would be whether there are shared resources between I/O streams of different types -- for example, would there be any contention between two threads writing to two file descriptors, one for a network socket, the other for a file on disk?

Upvotes: 1

Views: 2037

Answers (1)

Myst

Reputation: 19221

Although mutual exclusion is enforced, that isn't overly helpful and could still result in interleaved writing... let me explain.

From the Linux man page for read(3):

I/O is intended to be atomic to ordinary files and pipes and FIFOs. Atomic means that all the bytes from a single operation that started out together end up together, without interleaving from other I/O operations. It is a known attribute of terminals that this is not honored, and terminals are explicitly (and implicitly permanently) excepted, making the behavior unspecified. The behavior for other device types is also left unspecified, but the wording is intended to imply that future standards might choose to specify atomicity (or not).

Sockets count as FIFOs.

Moreover, sockets share the same underlying hardware, so all sockets on the same Ethernet interface have to be synchronized before their data passes through to the actual network (synchronization happens even between different sockets).

So, technically, you can write from two or more threads (or even processes) without clobbering the I/O layer and causing a whole mess of things.

However, you should notice that not all of the bytes are guaranteed to be written to the fd. As mentioned in the man page for write():

write() writes up to count bytes from the buffer pointed to by buf to the file referred to by the file descriptor fd.

This may lead to fragmentation when two (or more) processes/threads try to write to the same socket.

(EDIT - see comments) Moreover, large calls to write could be internally split into multiple write operations (depending on the internal buffer for the fd). This could result in the internal file lock being released and re-acquired in a way that allows interleaved data to be written.

For instance, assume the following scenario:

Thread1 calls `write(fd, "Hello long message", 18);`

Context switch.

Thread2 calls `write(fd, "Hello another message", 21);`

Context switch.

Thread1 gets the return value of 7 (there wasn't enough room in the buffer for the long message), meaning only some of the message was sent.

Context switch (system does some stuff too).

Thread2 gets the return value of 10 (there wasn't enough room in the buffer for its message either), meaning only some of the message was sent.

In this scenario, the client got something like "Hello lHello anot" (or maybe "Hello anotHello l", as the order of the write operations probably isn't guaranteed in this case), which is NOT the intended effect...

So even though the OS guarantees that both read and write can be deemed "atomic" (we didn't get "HelHellolo..."), you will still find yourself in need of a user-land buffer and some sort of synchronizer to manage parallel writes.

If you're feeling lazy, I wrote a small two-file library for common multithreaded socket operations for my project. I haven't tested it for two parallel thread writes just yet, but I will once I implement HTTP/2.

This library isn't the most performance oriented code I could think of, but it might give you a place to start.

P.S.

From your question it seems you're writing a server application that uses a thread per client... if that's the case, I urge you to reconsider.

Under heavy load, any machine running such a server application can crash outright: a DoS attack that manages to spawn a high enough number of threads leaves the machine stuck performing more context switches than tasks, until it's just context switching while doing nothing (except, maybe, burning the CPU).

Edit (answering the question in the comments)

Basically, in the underlying implementation, each fd is assigned a "lock" (I'm simplifying, but it's basically how it's done).

read, write and any I/O operation that is considered atomic will attempt to acquire the lock before the operation is performed. (EDIT:) However, internally the lock may be released and re-acquired while the fd's internal buffer is being flushed.

This means that these operations are atomic only with respect to the same fd. So, any two I/O operations that affect the same fd will synchronize with each other and give the impression of being atomic.

On a lower level, since sockets share the same hardware (assuming you have a single network interface), TCP packets are synchronized as they are sent, so no TCP packet fragmentation will occur.

This is, however, a different synchronization issue than the I/O C API, since the I/O provided by the OS (C API) writes to an intermediate internal buffer. (EDIT:) When that buffer is full, the internal fd lock may be released while the buffer flushes, resulting in possibly interleaved data.

The kernel manages this buffer as it writes to files, sockets, pipes, etc. - each one with its own unique synchronization concerns; e.g. non-SSD hard disks need to synchronize their writes with the disk's rotation and can't really provide good hardware concurrency.

Kernel issues, hardware issues and API constraints are all different levels where concurrency would be lost or achieved. Some operations that lose concurrency gain performance through hardware accelerations...

... in the end, as software developers, we do our best on our end, and hope the kernel and hardware do their best on their end.

Upvotes: 1
