Marcus Vinicius
Marcus Vinicius

Reputation: 1908

C# TCP Server stop receiving client messages, resumes when service is restarted

I working in a managed Windows Service written with C#. It keeps receiving messages from several clients connected over TCP/IP. The Client is basically a router that receive and resend messages from thermometers to the Server. The Server parse the messages and store them in a SQL Server database.

The problem I am facing is that some clients, suddenly, stops sending messages. But, as soon the service is restarted, they connect again and resume sending. I don't have the code of the Client since it is a third party device and I pretty sure the problem is with the Server.

I manage to reduce the problem by implementing a timer that keeps checking if each client is still connected (see code below). Also, I added a Keep Alive mode to the Socket, using the socket.IOControl(IOControlCode.KeepAliveValues, ...) method, but the problem still happening.

I'm posting some code from specific parts I consider relevant. But, if more snippets are needed to understand the problem, please ask me and I'll edit the post. All the try/catch blocks was removed to reduce the ammount of code.

I don't want a perfect solution, just any guidance will be appreciated.

private Socket _listener;
private ConcurrentDictionary<int, ConnectionState> _connections;

public TcpServer(TcpServiceProvider provider, int port)
{
    this._provider = provider;
    this._port = port;
    this._listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
    this._connections = new ConcurrentDictionary<int, ConnectionState>();

    ConnectionReady = new AsyncCallback(ConnectionReady_Handler);
    AcceptConnection = new WaitCallback(AcceptConnection_Handler);
    ReceivedDataReady = new AsyncCallback(ReceivedDataReady_Handler);
}                

public bool Start()
{    
    this._listener.Bind(new IPEndPoint(IPAddress.Any, this._port));
    this._listener.Listen(10000);
    this._listener.BeginAccept(ConnectionReady, null);    
}

// Check every 5 minutes for clients that have not send any message in the past 30 minutes
// MSG_RESTART is a command that the devices accepts to restart
private void CheckForBrokenConnections()
{
    foreach (var entry in this._connections)
    {
        ConnectionState conn = entry.Value;

        if (conn.ReconnectAttemptCount > 3)
        {
            DropConnection(conn);
            continue;
        }

        if (!conn.Connected || (DateTime.Now - conn.LastResponse).TotalMinutes > 30)
        {
            byte[] message = HexStringToByteArray(MSG_RESTART);

            if (!conn.WaitingToRestart && conn.Write(message, 0, message.Length))
            {
                conn.WaitingToRestart = true;                    
            }
            else
            {
                DropConnection(conn);                
            }
        }
    }        
}


private void ConnectionReady_Handler(IAsyncResult ar)
{    
    lock (thisLock)
    {
        if (this._listener == null)
            return;

        ConnectionState connectionState = new ConnectionState();
        connectionState.Connection = this._listener.EndAccept(ar);

        connectionState.Server = this;
        connectionState.Provider = (TcpServiceProvider)this._provider.Clone();
        connectionState.Buffer = new byte[4];
        Util.SetKeepAlive(connectionState.Connection, KEEP_ALIVE_TIME, KEEP_ALIVE_TIME);
        int newID = (this._connections.Count == 0 ? 0 : this._connections.Max(x => x.Key)) + 1;
        connectionState.ID = newID;
        this._connections.TryAdd(newID, connectionState);

        ThreadPool.QueueUserWorkItem(AcceptConnection, connectionState);

        this._listener.BeginAccept(ConnectionReady, null);
    }
}

private void AcceptConnection_Handler(object state)
{    
    ConnectionState st = state as ConnectionState;
    st.Provider.OnAcceptConnection(st);

    if (st.Connection.Connected)
        st.Connection.BeginReceive(st.Buffer, 0, 0, SocketFlags.None, ReceivedDataReady, st);    
}

private void ReceivedDataReady_Handler(IAsyncResult result)
{
    ConnectionState connectionState = null;

    lock (thisLock)
    {
        connectionState = result.AsyncState as ConnectionState;
        connectionState.Connection.EndReceive(result);

        if (connectionState.Connection.Available == 0)
            return;

        // Here the message is parsed
        connectionState.Provider.OnReceiveData(connectionState);

        if (connectionState.Connection.Connected)
            connectionState.Connection.BeginReceive(connectionState.Buffer, 0, 0, SocketFlags.None, ReceivedDataReady, connectionState);
    }
}

internal void DropConnection(ConnectionState connectionState)
{
    lock (thisLock)
    {
        if (this._connections.Values.Contains(connectionState))
        {
            ConnectionState conn;
            this._connections.TryRemove(connectionState.ID, out conn);
        }

        if (connectionState.Connection != null && connectionState.Connection.Connected)
        {
            connectionState.Connection.Shutdown(SocketShutdown.Both);
            connectionState.Connection.Close();
        }
    }
}

Upvotes: 1

Views: 2929

Answers (3)

Alon Catz
Alon Catz

Reputation: 2530

I have experienced these kind of situation many times. The problem is probably not with your code at all but with the network and the way Windows (on boths ends) or the routers handle the network. What happens quite often is that a temporary network outage "breaks" the socket, but Windows isn't aware of it, so it doesn't close the socket.

The only way to overcome this is exactly what you did - sending keep-alives and monitoring connection health. Once you recognize the the connection is down, you need to restart it. However, in your code you don't restart the listener socket which is also broken and can't accept new connections. That's why restarting the service helps, it restarts the listener.

Upvotes: 1

Kevin Seifert
Kevin Seifert

Reputation: 3572

Add try/catch blocks around all the IO calls, and write the errors to a log file. As it is, it can't recover on error.

Also, be careful with any lock that doesn't have a timeout. These operations should be given a reasonable TTL.

Upvotes: 1

JimR
JimR

Reputation: 16113

2 things I think I see...

  • If this is a connection you keep for multiple messages, you probably should not return from ReceivedDataReady_Handler when connectionState.Connection.Available == 0 IIRC a 0 length data paket can be received. So if the connection is still open, you should call connectionState.Connection.BeginReceive( ... ) before leaving the handler.

  • (I hesitate to put this here because I do not remember specifics) There is an event you can handle that tells you when things happen to your underlying connection including errors and failures connecting or closing a connection. For the life of me I cannot remember the name(s)... This would likely be more efficient than a timer every few seconds. It also gives you a way to break out of connections stuck in the connecting or closing states.

Upvotes: 2

Related Questions