user1537941

Reputation: 11

Winsock send call is very slow

I am making a game server (TCP) that may need to handle 3000+ connections. It currently has 50 connections, and I am already experiencing lag.

I found that the Winsock send() calls take about 100~300 ms each to return, which slows down the whole server because it is single-threaded.

So far I've thought about two solutions.

  1. Redesign my server to create a thread for each client (is it really stable to create 3000+ threads for 3000 clients?).
  2. Find a way to make the send() call return immediately.

This is my socket init code:

int ret = WSAStartup(MAKEWORD(2,2), &wsaData);
if(ret != 0) 
{ 
    printf("Winsock failed to start.\n");
    system("pause");
    return; 
}

server.sin_family = AF_INET;
server.sin_addr.s_addr = INADDR_ANY;
server.sin_port = htons(52000);

sock = socket(AF_INET, SOCK_STREAM, 0);
if(sock == INVALID_SOCKET) 
{ 
    printf("Invalid Socket.\n");
    system("pause");
    return; 
}

if(bind(sock, (sockaddr*)&server, sizeof(server)) != 0)
{ 
    printf("error");
    return; 
}

if(listen(sock, 5) != 0)
{
    printf("error");
    return;
}

Accepting code (in a separate thread):

sockaddr_in from;
int fromlen = sizeof(from);
SOCKET sTmpSocket = accept(ServerSock, (struct sockaddr*)&from, &fromlen);

My send function:

void CClient::SendPacket(BYTE* pPacket, DWORD Len)
{
    DWORD ToSendLen = Len;
    while (ToSendLen)
    {
        int iResult = send(this->ClientSocket, (char*)(pPacket + Len - ToSendLen), ToSendLen, 0);

        if (iResult == SOCKET_ERROR)
            return;

        ToSendLen -= iResult;
    }
}

This is my server thread (incomplete, only the relevant part):

while (1)
{
    for (int i = 0; i < MAX_CLIENT; i++)
    {
        if (Clients[i].bConnected == false)
            continue;

        timeval timeout;
        timeout.tv_sec = 0;
        timeout.tv_usec = 1;
        fd_set socketset;
        socketset.fd_count = 1;
        socketset.fd_array[0] = Clients[i].ClientSocket;
        if (select(0, &socketset, 0, 0, &timeout))
        {
            int RecvLen = recv(Clients[i].ClientSocket, (char*)pBuffer, 10000, MSG_PEEK);
            if (RecvLen == SOCKET_ERROR)
            {
                Clients[i].bConnected = false;
                iNumClients--;
            }
            else if (RecvLen == 0)
            {
                Clients[i].bConnected = false;
                iNumClients--;
            }
            else if (RecvLen > 0)
            {
                // Packet handling here
                recv(Clients[i].ClientSocket, (char*)pBuffer, dwDataLen, MSG_WAITALL);

                //...
            }
        }
    }

    Sleep(1);
}

Any help would be greatly appreciated. Thank you.

Upvotes: 1

Views: 3705

Answers (4)

Martin James

Reputation: 24857

If you are only sending small packets, start by setting the TCP_NODELAY option with setsockopt().
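A minimal sketch of what that could look like. DisableNagle and socket_t are hypothetical names, not from the question, and the #ifdef is only there so the same snippet also builds outside Windows:

```cpp
#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
typedef SOCKET socket_t;
#else
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
typedef int socket_t;
#endif

// Hypothetical helper (not from the original post): turns off Nagle's
// algorithm on a TCP socket so small packets are not held back for
// coalescing. Returns true if setsockopt() succeeded.
bool DisableNagle(socket_t s)
{
    int flag = 1; // non-zero enables TCP_NODELAY
    return setsockopt(s, IPPROTO_TCP, TCP_NODELAY,
                      reinterpret_cast<const char*>(&flag),
                      sizeof(flag)) == 0;
}
```

In the question's code this would probably be called once per client, right after accept() returns a valid socket, for example DisableNagle(sTmpSocket).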

Upvotes: 0

markh44

Reputation: 6080

Another approach to consider is to have multiple worker threads, each handling a share of the active connections, and to make the number of threads configurable. For example, you could create n threads per core and then experiment with n.

This should give you a lot of the benefits of multi-threading without going to the extreme of one thread per client.

Upvotes: 0

Some programmer dude

Reputation: 409176

One thing that comes to mind immediately is to use non-blocking sockets and check whether a socket is writable with select.

As a side note, you are using the return value from select incorrectly. It returns 0 on timeout, a positive number if there are sockets ready in any set, and -1 if there is an error. You do not check for the error. Also, while not needed in the Winsock version, the first argument to select should really be the highest socket number plus one.

Another point: you are calling select for only one client at a time. Add all clients to the set and then do a single select:

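// Same timeout values as in the question
timeval timeout;
timeout.tv_sec = 0;
timeout.tv_usec = 1;

fd_set readset;
FD_ZERO(&readset);
SOCKET max_socket = 0;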

for (int i = 0; i < MAX_CLIENT; i++)
{
    if (Clients[i].bConnected)
    {
        FD_SET(Clients[i].ClientSocket, &readset);
        max_socket = std::max(max_socket, Clients[i].ClientSocket);
    }
}

int res = select(max_socket + 1, &readset, 0, 0, &timeout);
if (res == SOCKET_ERROR)
    std::cout << "Error #" << WSAGetLastError() << '\n';
else if (res > 0)
{
    for (int i = 0; i < MAX_CLIENT; i++)
    {
        if (Clients[i].bConnected && FD_ISSET(Clients[i].ClientSocket, &readset))
        {
            // Read from client
        }
    }
}

Upvotes: 0

Simon

Reputation: 1504

I would go for IOCP (I/O completion ports), which can deliver the desired performance and scales well. Take a look at Boost.Asio, which uses IOCP under the hood on Windows.

Having 3000 or more threads is a very bad idea and really doesn't scale (in terms of memory consumption and context switches)!

Upvotes: 4
