Reputation: 7203
I am getting sockets stuck in CLOSE_WAIT when two of my daemons talk to each other. After reading various questions and blog entries on the subject, I have verified that I am closing the socket on both sides (originator and receiver).
The model goes as follows:
Sender: establish connection, send data, wait for confirmation, close connection
Receiver: receive connection, read data, send confirmation, close connection
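For reference, the model above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the "OK" confirmation and the use of socketpair() in place of a real TCP connection are all made up here so the example runs on its own, and each message is assumed to arrive in a single read().

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sender, step 1: send the data. Returns 0 on success, -1 on failure. */
int sender_send(int fd, const char *msg)
{
    assert(msg != NULL);
    size_t len = strlen(msg);
    return write(fd, msg, len) == (ssize_t)len ? 0 : -1;
}

/* Sender, step 2: wait for the confirmation, then close on every path. */
int sender_finish(int fd)
{
    char ack[2];
    ssize_t n = read(fd, ack, sizeof ack);  /* assumes "OK" arrives whole */
    int ok = (n == 2 && memcmp(ack, "OK", 2) == 0);
    if (close(fd) < 0)
        return -1;
    return ok ? 0 : -1;
}

/* Receiver: read the data, send the confirmation, close on every path. */
int receiver_handle(int fd)
{
    char buf[256];
    int rc = -1;
    ssize_t n = read(fd, buf, sizeof buf);
    if (n > 0 && write(fd, "OK", 2) == 2)
        rc = 0;
    if (close(fd) < 0)
        rc = -1;
    return rc;
}

/* Runs one exchange in a single process over a socketpair (hypothetical demo). */
int run_exchange(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (sender_send(sv[0], "hello") < 0)
        return -1;
    if (receiver_handle(sv[1]) < 0)
        return -1;
    return sender_finish(sv[0]);
}
```

The point of the sketch is that close() sits outside the success branches, so an early error on either side cannot skip it.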
Can anyone tell me what I'm doing wrong? Note: I am using close() to close the connections right now. I have tried using shutdown as well and it hasn't changed things. Any hints would be greatly appreciated.
EDIT: Shortly after closing the socket, the receiving daemon forks. I have tried passing the file descriptor to the function that forks and explicitly closing it again in the child process, but this did not fix my problem. Is there any other way that forking could affect this process? Note that the sending daemon does not fork.
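On the forking question, one thing worth knowing: a forked child inherits a copy of every descriptor that is open at the moment of fork(), and the connection is only torn down once the last copy is closed. The sketch below illustrates this with a socketpair() standing in for the TCP connection (an assumption to keep it self-contained; with TCP, the missing end-of-file corresponds to the FIN not being sent yet). `peer_sees_eof` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * The parent closes its copy of one end of a connection after fork().
 * Returns 1 if the other end sees end-of-file within timeout_ms,
 * 0 if it does not, -1 on setup failure.
 */
int peer_sees_eof(int child_closes_copy, int timeout_ms)
{
    int sv[2];    /* the "connection" */
    int gate[2];  /* pipe that keeps the child alive until we are done */
    assert(timeout_ms >= 0);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0 || pipe(gate) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: it inherited copies of all four descriptors. */
        char c;
        close(sv[1]);
        close(gate[1]);
        if (child_closes_copy)
            close(sv[0]);
        if (read(gate[0], &c, 1) < 0)  /* blocks until the parent releases us */
            _exit(1);
        _exit(0);
    }

    close(gate[0]);
    close(sv[0]);  /* the parent closes ITS copy only */

    struct pollfd p = { .fd = sv[1], .events = POLLIN };
    int seen = 0;
    if (poll(&p, 1, timeout_ms) > 0) {
        char c;
        seen = (read(sv[1], &c, 1) == 0);  /* 0 bytes means end-of-file */
    }

    close(gate[1]);  /* release the child */
    close(sv[1]);
    waitpid(pid, NULL, 0);
    return seen;
}
```

With this sketch, `peer_sees_eof(1, 2000)` should return 1, while `peer_sees_eof(0, 200)` should return 0 because the child still holds a copy. If the socket really is closed before the fork, as described in the question, the child never gets a copy and this mechanism would not apply.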
Upvotes: 0
Views: 6518
Reputation: 6523
Actually, these are quite common problems in multi-threaded server applications. There are two things you could do to resolve this problem:
The code for implementing both of the above solutions can differ slightly between *NIX and Microsoft platforms. The difference is only due to semantic differences.
I would recommend implementing both of the above measures.
However, if you cannot modify the code, you could use libkeepalive.
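As far as I know, libkeepalive is a preloadable library that switches on TCP keepalive (SO_KEEPALIVE) for the sockets a program creates. If you can modify the code, the rough equivalent is a single setsockopt() call, sketched below. The helper names are hypothetical, and keepalive only helps detect a peer that has gone away; it does not replace calling close() on your own descriptor.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Turns on TCP keepalive probes for fd. Returns 0 on success, -1 on error. */
int enable_keepalive(int fd)
{
    int on = 1;
    assert(fd >= 0);
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on);
}

/* Returns 1 if keepalive is enabled on fd, 0 if not, -1 on error. */
int keepalive_enabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof val;
    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) < 0)
        return -1;
    return val != 0;
}

/* Hypothetical demo: create a TCP socket, enable keepalive, read it back. */
int keepalive_demo(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    int result = (enable_keepalive(fd) == 0) ? keepalive_enabled(fd) : -1;
    close(fd);
    return result;
}
```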
Upvotes: 0
Reputation: 7203
After looking in Wireshark, I saw that the final FIN, ACK packet was annotated with:
"[TCP ACKed lost segment] [TCP previous segment lost] ..."
It turns out that my problem was caused by having both daemons running on the same box (something we had added for testing). After trying again on multiple boxes, we no longer get this problem.
Upvotes: 1
Reputation: 3017
In my (short) experience, it's quite possible that you're closing the wrong fd, or not reaching the close() call at all. I ran into the latter, and the first clue was that my application became a zombie instead of closing (specifically, a simple printf right before the close statement made it all go to hell).
It might be worth your time to check the task manager/jobs/system monitor/<whatever process viewer is relevant to your OS>.
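One cheap way to check both possibilities is to wrap close() so that every call is logged and a bad descriptor is reported. A minimal sketch, assuming logging to stderr is acceptable; `checked_close` is a hypothetical helper, not a library function:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Closes fd and logs the outcome, so you can see whether the call was
 * reached and whether the descriptor was valid. Returns 0 or -1 like
 * close(), with errno preserved on failure.
 */
int checked_close(int fd, const char *what)
{
    assert(what != NULL);
    if (close(fd) == 0) {
        fprintf(stderr, "closed %s (fd %d)\n", what, fd);
        return 0;
    }
    int saved = errno;  /* fprintf may overwrite errno */
    fprintf(stderr, "close of %s (fd %d) failed: %s\n",
            what, fd, strerror(saved));
    errno = saved;
    return -1;
}
```

If the log line never shows up, the close was not reached; if it reports EBADF, the descriptor passed in was not an open one.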
Upvotes: 0
Reputation: 1354
When an application has an open socket and, after some sending and receiving, gets a FIN from its peer, the socket goes into the CLOSE_WAIT state from that point onwards. It can remain in that state forever, until you explicitly call close(). I hope you are actually passing the right fd to close().
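In code, the peer's FIN shows up as read() returning 0, and that is the cue to call close(). A minimal sketch of that pattern, with a socketpair() standing in for the TCP connection so it can run on its own (the function names are hypothetical):

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Reads until the peer has closed its side (read() returns 0), then closes
 * our side too. Returns the number of bytes read, or -1 on error.
 */
long drain_and_close(int fd)
{
    char buf[512];
    long total = 0;
    assert(fd >= 0);
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;       /* interrupted by a signal, retry */
        if (n < 0)
            total = -1;     /* real error */
        break;              /* n == 0: the peer closed its side */
    }
    if (close(fd) < 0)      /* without this a TCP socket stays in CLOSE_WAIT */
        return -1;
    return total;
}

/* Hypothetical demo: the peer writes three bytes and closes. */
long drain_demo(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (write(sv[0], "abc", 3) != 3)
        return -1;
    close(sv[0]);
    return drain_and_close(sv[1]);
}
```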
Upvotes: 0