Reputation: 367
I’d like to reopen a previous issue that was incorrectly classified as a network engineering problem; after more testing, I think it’s a real issue for programmers.
My application streams MP3 files from a server that I can’t modify. The client reads data from the server as needed, at 160 kbit/s, and feeds it to a DAC. Let’s take a 3.5 MB file as an example.
When the server is done sending the last byte, it closes the connection, so it sends a FIN, which seems to be normal practice.
The problem is that the kernel, especially on Windows, seems to store 1 to 3 MB of data; I assume the TCP window has fully opened.
After a few seconds, the server has sent the whole 3.5 MB and about 3 MB sits in the client’s kernel buffer. At this point the server has sent its FIN, which is ACKed in due time.
From the client’s point of view, it continues reading data in chunks of 20 kB and will do so for the next 3 MB / (20 kB/s) ≈ 150 s before it sees the EOF.
Meanwhile the server is in FIN_WAIT_2 (and not TIME_WAIT as I initially wrote, thanks to Steffen for correcting me). Now, OSes like Windows seem to have a half-closed socket timer that starts when they send their FIN and can be as small as 120 s, regardless of the actual TCP window size, BTW. Of course, after 120 s the server considers that it should have received the client’s FIN, so it sends a RST. That RST causes everything in the client’s kernel buffer to be discarded, and the application fails.
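To make the timing argument concrete, here is a minimal sketch of the arithmetic. The function names are mine and purely illustrative; the figures (about 3 MB buffered, 160 kbit/s, a 120 s timer) are the ones quoted above.

```c
/* Seconds needed to consume `buffered_bytes` when playing at `bitrate_bps` bits/s. */
double drain_seconds(double buffered_bytes, double bitrate_bps) {
    return buffered_bytes / (bitrate_bps / 8.0);
}

/* Returns 1 if unread data would still be sitting in the kernel buffer
 * when the server's half-close timer (`timer_s` seconds) expires. */
int outlives_timer(double buffered_bytes, double bitrate_bps, double timer_s) {
    return drain_seconds(buffered_bytes, bitrate_bps) > timer_s;
}
```

With 3,000,000 bytes buffered at 160,000 bit/s, drain_seconds() gives 150 s, which is longer than the 120 s timer, so the RST arrives while data is still unread. With 2,000,000 bytes buffered it would take 100 s and the exchange would complete.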
As code is required, here it is:
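#include <winsock2.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    WSADATA wsa;
    WSAStartup(MAKEWORD(2, 2), &wsa);

    SOCKET sock = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(80);
    if (connect(sock, (const struct sockaddr*)&addr, sizeof(addr)) != 0) {
        printf("connect failed\n");
        return 1;
    }

    /* HTTP lines end with CRLF ("\r\n"); one empty line ends the headers */
    const char* get = "GET /data-3 HTTP/1.0\r\n"
                      "User-Agent: mine\r\n"
                      "Host: localhost\r\n"
                      "Connection: close\r\n"
                      "\r\n";
    int bytes = send(sock, get, (int)strlen(get), 0);
    printf("send %d\n", bytes);

    bytes = 0; /* from here on, counts received bytes */
    char* buf = malloc(20000);
    while (1) {
        int n = recv(sock, buf, 20000, 0);
        if (n == 0) {
            printf("normal eof at %d\n", bytes);
            break;
        }
        if (n < 0) {
            printf("error at %d\n", bytes);
            exit(1);
        }
        bytes += n;
        /* pace consumption at 160 kbit/s = 20000 bytes/s */
        Sleep(n * 1000 / (160000 / 8));
    }
    free(buf);
    closesocket(sock); /* close exactly once */
    WSACleanup();
    return 0;
}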
It can be tested with any HTTP server.
I know there are solutions that involve a handshake with the server before it closes the socket (but the server is just an HTTP server), but the kernel-level buffering makes this a systematic failure whenever the buffered data takes longer to consume than the server’s timeout.
The client absorbs data perfectly in real time. Having a larger client buffer, or no buffer at all, does not change the issue, which seems like a system design flaw to me, unless it is possible either to control the kernel buffers at the application level (not for the whole OS) or to detect reception of the FIN at the client before recv() returns EOF. I’ve tried changing SO_RCVBUF, but it does not seem to influence this level of kernel buffering in any logical way.
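For anyone who wants to watch the kernel-side backlog while reproducing this, here is a small sketch that asks the socket how many bytes are waiting to be read. The helper name pending_bytes and the sock_t typedef are mine. It only observes the buffering, it does not control it, and on Windows the FIONREAD value for a stream socket is documented as the amount readable in a single recv() call, which may be less than everything queued.

```c
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET sock_t;
#else
#include <sys/ioctl.h>
#include <sys/socket.h>
typedef int sock_t;
#endif

/* Returns the number of bytes the kernel reports as readable on `sock`,
 * or -1 if the query fails. */
long pending_bytes(sock_t sock) {
#ifdef _WIN32
    u_long n = 0;
    if (ioctlsocket(sock, FIONREAD, &n) != 0) return -1;
#else
    int n = 0;
    if (ioctl(sock, FIONREAD, &n) != 0) return -1;
#endif
    return (long)n;
}
```

Calling pending_bytes(sock) inside the receive loop, next to the recv() call, should show whether megabytes are already queued locally while the client is still pacing its reads.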
Here is a capture of one successful and one failed exchange:
success
3684 381.383533 192.168.6.15 192.168.6.194 TCP 54 [TCP Retransmission] 9000 → 52422 [FIN, ACK] Seq=9305427 Ack=54 Win=262656 Len=0
3685 381.387417 192.168.6.194 192.168.6.15 TCP 60 52422 → 9000 [ACK] Seq=54 Ack=9305428 Win=131328 Len=0
3686 381.387417 192.168.6.194 192.168.6.15 TCP 60 52422 → 9000 [FIN, ACK] Seq=54 Ack=9305428 Win=131328 Len=0
3687 381.387526 192.168.6.15 192.168.6.194 TCP 54 9000 → 52422 [ACK] Seq=9305428 Ack=55 Win=262656 Len=0
failed
5375 508.721495 192.168.6.15 192.168.6.194 TCP 54 [TCP Retransmission] 9000 → 52436 [FIN, ACK] Seq=5584802 Ack=54 Win=262656 Len=0
5376 508.724054 192.168.6.194 192.168.6.15 TCP 60 52436 → 9000 [ACK] Seq=54 Ack=5584803 Win=961024 Len=0
6039 628.728483 192.168.6.15 192.168.6.194 TCP 54 9000 → 52436 [RST, ACK] Seq=5584803 Ack=54 Win=0 Len=0
Upvotes: 1
Views: 308
Reputation: 367
Here is what I think is the cause, thanks very much to Steffen for putting me on the right track.
That's it. This can be reproduced with a few lines of code and any HTTP server. It can be debated, but I see it as a systemic OS issue. Now, the solution that seems to work is to force the client's receive buffer (SO_RCVBUF) to a lower value, so that there is little chance of the server having sent all the data and that data then sitting in the client's kernel buffer for too long. Note that this can still happen, though, if the buffer is 20 kB and the client consumes it at 1 B/s, hence I call it a systemic failure. Now, I agree that some will see it as an application issue.
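A minimal sketch of that workaround, assuming the option is set on the socket before connect() (the Linux tcp(7) man page requires that ordering for the size to take effect on a connection; I am assuming the same ordering is the safe choice on Windows). The helper names limit_rcvbuf and get_rcvbuf are mine, and the 20000-byte value in the usage note is only an example.

```c
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET sock_t;
#else
#include <sys/socket.h>
typedef int sock_t;
#endif

/* Ask the kernel for a receive buffer of roughly `bytes` bytes.
 * Intended to be called before connect(). Returns 0 on success, -1 on failure. */
int limit_rcvbuf(sock_t sock, int bytes) {
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF,
                   (const char*)&bytes, sizeof(bytes)) != 0)
        return -1;
    return 0;
}

/* Read back the size the kernel actually applied, or -1 on failure.
 * Linux reports about twice the requested value (bookkeeping overhead). */
int get_rcvbuf(sock_t sock) {
    int value = 0;
#ifdef _WIN32
    int len = sizeof(value);
#else
    socklen_t len = sizeof(value);
#endif
    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, (char*)&value, &len) != 0)
        return -1;
    return value;
}
```

In the test client this would be something like limit_rcvbuf(sock, 20000) right after socket() and before connect(). Reading the value back with get_rcvbuf() is a cheap way to check whether the OS honoured the request.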
Upvotes: 1