Reputation: 397
UPDATE: After investigating a little more, I found the real cause of this behavior. I create a thread for each connection and pass the socket fd to that thread, but I was not calling pthread_join on those threads promptly, so after a while the main thread could no longer create new threads after accepting a connection. My socket-closing logic lives in the child thread, so those sockets were never closed and ended up in the CLOSE_WAIT state. I now detach each thread right after creating it, and everything works so far.
I have a client/server program. I use a script that runs the client, opens as many connections as possible, sends one line of data on each, closes it, and exits the client. Everything works fine until the 32739th connection, i.e. each connection is closed on both sides. After that number, connections are no longer closed and the server stops accepting new ones. If I run
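For readers who want to see the fix from the update in code, here is a minimal sketch, assuming POSIX threads. The names handle_client and spawn_detached_handler are hypothetical and not taken from the original server; the real worker would do its own reading and processing where the placeholder comment is.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical per-connection worker: it owns client_fd and closes it when done. */
static void *handle_client(void *arg)
{
    int client_fd = (int)(intptr_t)arg;

    /* ... read the client's line and process it here ... */

    close(client_fd);   /* once the peer has closed, skipping this leaves CLOSE_WAIT */
    return NULL;
}

/* Starts a detached worker for client_fd. Returns 0 on success, or the
 * pthread error number; on failure the fd is closed here so it cannot leak. */
int spawn_detached_handler(int client_fd)
{
    pthread_t tid;
    int rc;

    assert(client_fd >= 0);

    rc = pthread_create(&tid, NULL, handle_client, (void *)(intptr_t)client_fd);
    if (rc != 0) {
        close(client_fd);
        return rc;
    }
    /* A detached thread releases its resources on exit; no pthread_join is needed. */
    return pthread_detach(tid);
}
```

The main loop would call spawn_detached_handler(fd) right after accept() returns fd and would not touch that fd again.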
netstat -tonpa 2>&1 | grep CLOSE
I see around 1020 sockets in the CLOSE_WAIT state. Sample output of the command:
tcp 25 0 192.168.0.175:16099 192.168.0.175:41704 CLOSE_WAIT 5250/./bl_manager off (0.00/0/0)
tcp 24 0 192.168.0.175:16099 192.168.0.175:41585 CLOSE_WAIT 5250/./bl_manager off (0.00/0/0)
tcp 30 0 192.168.0.175:16099 192.168.0.175:41679 CLOSE_WAIT 5250/./bl_manager off (0.00/0/0)
tcp 31 0 192.168.0.175:16099 192.168.0.175:41339 CLOSE_WAIT 5250/./bl_manager off (0.00/0/0)
tcp 25 0 192.168.0.175:16099 192.168.0.175:41760 CLOSE_WAIT 5250/./bl_manager off (0.00/0/0)
I am using the following code to detect client disconnection:
for (fd = 0; fd <= fd_max; fd++) {
    if (FD_ISSET(fd, &testfds)) {
        if (fd == client_fd) {
            ioctl(fd, FIONREAD, &nread);
            if (nread == 0) {
                FD_CLR(fd, &readfds);
                close(fd);
                return 0;
            }
        }
    }
} /* for() */
Please let me know if I am doing anything wrong. It's a Python client and C++ server setup.
Thank you.
Upvotes: 0
Views: 266
Reputation: 365915
Without knowing your platform, I can't be sure, but given that you're clearly using select, and you're having a problem only a few dozen connections away from 32768, it seems very likely that the size limit of fd_set is your problem.
An fd_set is a collection of bits, indexed by file descriptor number. Every platform has a different max number. OpenBSD and recent versions of FreeBSD and OS X usually limit fd_set to an FD_SETSIZE that defaults to 1024. Different Linux boxes seem to have 1024, 4096, 32768, and 65536.
So, what happens if you FD_ISSET(32800, &testfds) and FD_SETSIZE is 32768? You're asking it to read a bit from arbitrary memory.
A select or other call before this should give you an EINVAL error when you pass in 32800 for the nfds parameter… but historically, many platforms have not done so. Or they have returned an error, but only after filling in the first FD_SETSIZE bits properly and leaving the rest set to uninitialized memory, which means if you forget to check the error, your code seems to work until you stress it.
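One way to avoid touching memory outside the set is to check every descriptor against FD_SETSIZE before it goes anywhere near FD_SET or FD_ISSET. This is only a sketch; fd_set_add_checked is a hypothetical helper name, not a standard call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>

/* Adds fd to set only if it fits; returns false for descriptors that
 * fd_set cannot represent, so the caller can reject or close them. */
bool fd_set_add_checked(int fd, fd_set *set)
{
    assert(set != NULL);

    if (fd < 0 || fd >= FD_SETSIZE)
        return false;   /* FD_SET here would write outside the bit array */

    FD_SET(fd, set);
    return true;
}
```

A server could call this on the descriptor returned by accept() and close the connection when it returns false, rather than silently corrupting memory.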
This is one of the reasons using select for more than a few hundred sockets is a bad idea. The other reason is that select is linear (and, worse, not linear in the number of current sockets, but linear in the highest fd, so even after most clients go away it's still slow).
Most modern platforms that have select also have poll, which avoids that problem.
Unless you're on Windows… in which case there are completely different reasons not to use select, and different answers.
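As a rough illustration of the poll alternative on a POSIX system: poll takes an array of struct pollfd, so there is no FD_SETSIZE cap on descriptor values and the work scales with the number of entries rather than the highest fd. The helper names first_ready_index and drop_client are hypothetical.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Waits up to timeout_ms for activity on pfds[0..count-1].
 * Returns the index of the first descriptor that is readable or has
 * hung up, or -1 on timeout or error. */
int first_ready_index(struct pollfd *pfds, nfds_t count, int timeout_ms)
{
    assert(pfds != NULL);

    if (poll(pfds, count, timeout_ms) <= 0)
        return -1;

    for (nfds_t i = 0; i < count; i++) {
        if (pfds[i].revents & (POLLIN | POLLHUP | POLLERR))
            return (int)i;
    }
    return -1;
}

/* poll() counterpart of FD_CLR + close: poll ignores entries whose fd is negative. */
void drop_client(struct pollfd *slot)
{
    close(slot->fd);
    slot->fd = -1;
    slot->revents = 0;
}
```

Each entry is set up with .fd = the client socket and .events = POLLIN; when a client is found to have disconnected, drop_client retires its slot so it can be reused for a later connection.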
Upvotes: 1
Reputation: 310985
CLOSE_WAIT means the port is waiting for the local application to close the socket, having already received a close from the peer. Clearly you are leaking sockets somehow, possibly in an error path.
Your code to 'detect client disconnection' is completely incorrect. All you are testing is the amount of data that can be read without blocking, i.e. that has already arrived. The correct test is a return value of zero from recv() or an error other than EAGAIN/EWOULDBLOCK when reading or writing.
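A minimal sketch of that test, assuming a connected stream socket; the enum and the function name read_peer are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

enum peer_state { PEER_DATA, PEER_WOULD_BLOCK, PEER_CLOSED, PEER_ERROR };

/* Tries to read up to len bytes from fd into buf and classifies the result.
 * On PEER_DATA, *got holds the number of bytes read. */
enum peer_state read_peer(int fd, char *buf, size_t len, size_t *got)
{
    ssize_t n;

    /* len must be non-zero: a zero-length recv() also returns 0. */
    assert(buf != NULL && got != NULL && len > 0);
    *got = 0;

    n = recv(fd, buf, len, 0);
    if (n > 0) {
        *got = (size_t)n;
        return PEER_DATA;
    }
    if (n == 0)
        return PEER_CLOSED;          /* peer closed: the caller should close(fd) */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return PEER_WOULD_BLOCK;     /* non-blocking socket with nothing to read yet */
    return PEER_ERROR;               /* any other error: treat the connection as dead */
}
```

On PEER_CLOSED or PEER_ERROR the server should close the descriptor and stop watching it; that close is what moves the socket out of CLOSE_WAIT.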
Upvotes: 2