Reputation: 3065
I have been reading about TCP packet and how they can be split up any number of times during their voyage. I took this to assume I would have to implement some kind of buffer on top of the buffer used for the actual network traffic in order to store each ReceiveAsync()
until enough data is available to parse a message. BTW, I am sending length-prefixed, protobuf-serialized messages over TCP.
Then I read that the lower layers (ethernet?, IP?) will actually re-assemble packets transparently.
My question is, in C#, am I guaranteed to receive a full "message" over TCP? In other words, if I send 32 bytes, will I necessarily receive those 32 bytes in "one-go" (one call to ReceiveAsync()
)? Or do I have to "store" each receive until the number of bytes received is equal to the length-prefix?
Also, could I receive more than one message in a single call to ReceiveAsync()
? Say one "protobuf message" is 32 bytes. I send 2 of them. Could I potentially receive 48 bytes in "one go" and then 16 in another?
I know this question shows up easily on google, but I can never tell if it's in the correct context (talking about the actual TCP protocol, or how C# will expose network traffic to the programmer).
Thanks.
Upvotes: 1
Views: 1854
Reputation: 456342
TCP is a stream-based octet protocol. So, from the application's perspective, you can only read or write bytes to the stream.
I have been reading about TCP packet and how they can be split up any number of times during their voyage.
TCP packets are a network implementation detail. They're used for efficiency (it would be very inefficient to send one byte at a time). Packet fragmentation is done at the device driver / hardware level, and is never exposed to applications. An application never knows what a "packet" is or where its boundaries are.
I took this to assume I would have to implement some kind of buffer on top of the buffer used for the actual network traffic in order to store each ReceiveAsync() until enough data is available to parse a message.
Yes. Because "message" is not a TCP concept. It's purely an application concept. Most application protocols do define a kind of "message" because it's easier to reason about.
Some application protocols, however, do not define the concept of a "message"; they treat the TCP stream as an actual stream, not a sequence of messages.
In order to support both kinds of application protocols, TCP/IP APIs have to be stream-based.
BTW, I am sending length-prefixed, protobuf-serialized messages over TCP.
That's good. Length prefixing is much easier to deal with than the alternatives, IMO.
My question is, in C#, am I guaranteed to receive a full "message" over TCP?
No.
Or do I have to "store" each receive until the number of bytes received is equal to the length-prefix? Also, could I receive more than one message in a single call to ReceiveAsync()?
Yes, and yes.
Even more fun:
For more information on the details, see my TCP/IP .NET FAQ, particularly the sections on message framing and some example code for length-prefixed messages.
I strongly recommend using only asynchronous APIs in production; the synchronous alternative of having two threads per connection negatively impacts scalability.
Oh, and I also always recommend using SignalR if possible. Raw TCP/IP socket programming is always complex.
Upvotes: 1
Reputation: 26989
My question is, in C#, am I guaranteed to receive a full "message" over TCP?
No. You will not receive a full message. A single send does not result in a single receive. You must keep reading on the receiving side until you have received everything you need.
See the example here, it keeps the read data in a buffer and keeps checking to see if there is more data to be read:
private static void ReceiveCallback(IAsyncResult ar)
{
try
{
// Retrieve the state object and the client socket
// from the asynchronous state object.
StateObject state = (StateObject)ar.AsyncState;
Socket client = state.workSocket;
// Read data from the remote device.
int bytesRead = client.EndReceive(ar);
if (bytesRead > 0)
{
// There might be more data, so store the data received so far.
state.sb.Append(Encoding.ASCII.GetString(state.buffer, 0, bytesRead));
// Get the rest of the data.
client.BeginReceive(state.buffer, 0, StateObject.BufferSize, 0,
new AsyncCallback(ReceiveCallback), state);
}
else
{
// All the data has arrived; put it in response.
if (state.sb.Length > 1)
{
response = state.sb.ToString();
}
// Signal that all bytes have been received.
receiveDone.Set();
}
}
catch (Exception e)
{
Console.WriteLine(e.ToString());
}
}
See this MSDN article and this article for more details. The 2nd link goes into more details and it also has sample code.
Upvotes: 0
Reputation: 137398
TCP is a stream protocol - it transmits a stream of bytes. That's all. Absolutely no message framing / grouping is implied. In fact, you should forget that Ethernet packets or IP datagrams even exist when writing code using a TCP socket.
You may find yourself with 1 byte available, or 10,000 bytes available to read. The beauty of the (synchronous) Berkeley sockets API is that you, as an application programmer don't need to worry about this. Since you're using a length-prefixed message format (good job!) simply recv()
as many bytes as you're expecting. If there are more bytes available than the application requests, the kernel will keep the rest buffered until the next call. If there are fewer bytes available than required, the thread will either block or the call will indicate that fewer bytes were received. In this case, you can simply sleep again until data is available.
The problem with async APIs is that it requires the application to track a lot more state itself. Even this Microsoft example of Asynchronous Client Sockets is far more complicated than it needs to be. With async APIs, you still control the amount of data you're requesting from the kernel, but when your async callback is fired, you then need to know the next amount of data to request.
Note that the C# async
/await
in 4.5 make asynchronous processing easier, as you can do so in a synchronous way. Have a look at this answer where the author comments:
Socket.ReceiveAsync is a strange one. It has nothing to do with async/await features in .net4.5. It was designed as an alternative socket API that wouldn't thrash memory as hard as BeginReceive/EndReceive, and only needs to be used in the most hardcore of server apps.
Upvotes: 1