NullReferenceException, bug in C# socket BeginConnect?

We have a server application that communicates with clients via TCP sockets. After running for a few weeks, it crashes with a NullReferenceException that cannot be handled. I have been able to reproduce the exception with a very small console program, but the exception appears to be thrown, unhandled, on an internal socket thread-pool thread. I cannot catch it with any try/catch block because that thread is not under my control.

Does anybody have any idea about this? Is it a framework bug, or how can I catch the exception on the socket thread pool so that our application does not crash? Here is the example code that generates the exception after a few iterations (3-10). It is important to know that the server is offline, so the socket is unable to connect. I am using Visual Studio 2010 and .NET Framework 4.0.

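using System;
using System.Diagnostics;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks; // only needed for the commented-out Task variants

internal class Program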
{
    private static string host;

    private static Socket socket;

    private static void Main(string[] args)
    {
        Trace.Listeners.Add(new ConsoleTraceListener());

        AppDomain.CurrentDomain.UnhandledException += new UnhandledExceptionEventHandler(CurrentDomain_UnhandledException);

        socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

        host = "127.0.0.1";
        //the problem also happens when the host is another IP address on the network
        //host = "192.168.0.1";

        //when run on another thread via a task, the application does not crash
        //Task.Factory.StartNew(() => StartConnecting());

        //this also crashes the application
        //Task.Factory.StartNew(() => StartConnecting(), TaskCreationOptions.LongRunning);

        //with a regular thread the exception occurs
        ///*
        var thread = new Thread(new ThreadStart(StartConnecting));
        thread.Start();
        //*/

        //when called directly (blocking) the exception also occurs
        //StartConnecting();
        Console.WriteLine("Press any key to exit ...");
        Console.ReadKey();
    }

    private static void StartConnecting()
    {
        try
        {
            int count = 0;
            while (true)
            {
                try
                {
                    // should I switch to Socket.Connect(...) instead?
                    Trace.WriteLine(string.Format("Connect Try {0} begin", ++count));

                    var ar = socket.BeginConnect(host, 6500, new AsyncCallback(ConnectCallback), socket);

                    Trace.WriteLine(string.Format("Connect Try {0} end", count));
                }
                catch (Exception err)
                {
                    Trace.WriteLine(string.Format("[BeginConnect] error {0}", err.ToString()));
                }
                System.Threading.Thread.Sleep(1000);
                //will see the exception more quickly
            }
        }
        catch (Exception e)
        {
            Trace.WriteLine(string.Format("[StartConnecting] error {0}", e.ToString()));
        }
    }

    private static void CurrentDomain_UnhandledException(object sender, UnhandledExceptionEventArgs e)
    {
        string msg = e.ExceptionObject.ToString();

        Trace.WriteLine(string.Format("[CurrentDomain_UnhandledException] isTerminating={0} error {1}", e.IsTerminating, msg));

        Trace.WriteLine("Exiting process");

        //the other processing threads continue working
        //without problems until there is a Thread.Sleep
        //Thread.Sleep(10000);
    }

    private static void ConnectCallback(IAsyncResult ar)
    {
        try
        {
            Trace.WriteLine("[ConnectCallback] enter");
            var socket = (Socket)ar.AsyncState;
            socket.EndConnect(ar);

            Trace.WriteLine("[ConnectCallback] exit");
        }
        catch (Exception e)
        {
            Trace.WriteLine(string.Format("[ConnectCallback] error {0}", e.ToString()));
        }
    }
}

After the application starts the inevitable crash will occur:

[CurrentDomain_UnhandledException] isTerminating=True error System.NullReferenceException: Object reference not set to an instance of an object.
   at System.Net.Sockets.Socket.ConnectCallback()
   at System.Net.Sockets.Socket.RegisteredWaitCallback(Object state, Boolean timedOut)
   at System.Threading._ThreadPoolWaitOrTimerCallback.PerformWaitOrTimerCallback(Object state, Boolean timedOut)

Upvotes: 3

Views: 3195

Answers (3)

Simon Mourier

Reputation: 138925

I'm pretty confident this uncatchable error is caused by a bug in the Socket code, and you should report it on Microsoft Connect.

Here is an extract from the Socket.cs code at .NET reference source: http://referencesource.microsoft.com/#System/net/System/Net/Sockets/Socket.cs,938ed6a18154d0fc

private void ConnectCallback()
{
  LazyAsyncResult asyncResult = (LazyAsyncResult) m_AcceptQueueOrConnectResult;

  // If we came here due to a ---- between BeginConnect and Dispose
  if (asyncResult.InternalPeekCompleted)
  {
     // etc.
      return;
  }
}

This callback is called by another static method:

private static void RegisteredWaitCallback(object state, bool timedOut)
{
  Socket me = (Socket)state;

  // Interlocked to avoid a race condition with DoBeginConnect
  if (Interlocked.Exchange(ref me.m_RegisteredWait, null) != null)
  {
    switch (me.m_BlockEventBits)
    {
    case AsyncEventBits.FdConnect:
      me.ConnectCallback();
      break;

    case AsyncEventBits.FdAccept:
      me.AcceptCallback(null);
      break;
    }
  }
}

This static method is never unregistered; it is always called, but it relies on the m_RegisteredWait field to decide whether it must pass the call on to the socket's instance method.

I suppose the problem is that m_RegisteredWait is sometimes non-null while m_AcceptQueueOrConnectResult is null, which causes the NullReferenceException on a thread where you cannot catch it.

That being said, the root cause is that your code has problems in the first place, as others have noted. To avoid this horrible uncatchable error, just make sure you call Close or Dispose on the socket when an error happens; this will internally clear the m_RegisteredWait member. For example, the BeginConnect documentation says this:

To cancel a pending call to the BeginConnect method, close the Socket. When the Close method is called while an asynchronous operation is in progress, the callback provided to the BeginConnect method is called. A subsequent call to the EndConnect method will throw an ObjectDisposedException to indicate that the operation has been cancelled.

In your example, just add the following line to your callback code:

private static void ConnectCallback(IAsyncResult ar)
{
    try
    {
        ...
    }
    catch (Exception e)
    {
        // 'socket' is the static field from the question's code
        if (socket != null) socket.Dispose();
    }
}

Now, you'll still have errors but they will be normal errors.
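
As a rough sketch of that advice (not code from the question or from the framework), each attempt can own its own Socket, and the callback closes that socket when EndConnect fails. ConnectAttempt, Sock, Done and Error are hypothetical names made up for illustration, and I have not verified that this pattern avoids the crash on .NET 4.0.

```csharp
using System;
using System.Net.Sockets;
using System.Threading;

// Hypothetical helper for illustration; create one instance per connect attempt.
public sealed class ConnectAttempt
{
    public readonly Socket Sock =
        new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

    // Signaled once the callback has finished, whether the attempt succeeded or failed.
    public readonly ManualResetEvent Done = new ManualResetEvent(false);

    // Null if the connect succeeded; otherwise the exception thrown by EndConnect.
    public Exception Error;

    public void Begin(string host, int port)
    {
        // BeginConnect itself can throw; the caller is expected to handle that.
        Sock.BeginConnect(host, port, OnConnect, null);
    }

    private void OnConnect(IAsyncResult ar)
    {
        try
        {
            Sock.EndConnect(ar);
        }
        catch (Exception e)
        {
            Error = e;
            // Close the failed socket so it is never reused for another attempt.
            Sock.Close();
        }
        finally
        {
            Done.Set();
        }
    }
}
```

A retry loop would then wait on Done and, if Error is set, create a new ConnectAttempt rather than calling BeginConnect again on the closed socket, which would throw ObjectDisposedException.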

Upvotes: 1

John Saunders

Reputation: 161773

If you look carefully at the stack trace, you'll see that the NullReferenceException occurs in System.Net.Sockets.Socket.ConnectCallback. If you look at your code, you'll see that you have a method named ConnectCallback.

That's what we call a "coincidence".

Please change the name of your callback method to MyConnectCallback, and change the BeginConnect call to:

var ar = socket.BeginConnect(host, 6500, new AsyncCallback(MyConnectCallback), socket);

See if that changes anything.

If I'm correct, and your ConnectCallback method is never called, then I'm also forced to wonder how your code works at all.

Upvotes: 0

Benoit Blanchon

Reputation: 14521

The sample code you provided repeatedly calls BeginConnect without waiting for the async operation to complete.

Roughly, you're doing this:

while(true)
{
    socket.BeginConnect(...);
    Sleep(1000);
}

So when your thread starts, it first calls BeginConnect(), then waits one second, then calls BeginConnect() again while the previous call is still executing.

On my computer, it gives me an InvalidOperationException, but I guess the exception type may depend on the CLR version (I'm using .NET 4.5.1).

Here are 3 different solutions:

  1. Complete the async operation with Socket.EndConnect(), which blocks until the connect attempt finishes (to cancel it instead, close the socket)
  2. Wait for the async operation to complete with IAsyncResult.AsyncWaitHandle.WaitOne()
  3. Don't use BeginConnect() and use Connect() instead
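
Here is a minimal sketch of option 2, assuming you are free to create a new socket for every attempt. ConnectRetrySketch, TryConnectOnce and ConnectWithRetry are hypothetical names I made up, and the timeout and delay values are whatever suits your application. I have only reasoned about it against the documented Socket API, not tested it against the .NET 4.0 crash.

```csharp
using System;
using System.Net.Sockets;
using System.Threading;

public static class ConnectRetrySketch
{
    // Makes one connect attempt on a fresh socket and waits for it to finish.
    // Returns the connected socket, or null if the attempt failed or timed out.
    public static Socket TryConnectOnce(string host, int port, int timeoutMs)
    {
        var s = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        try
        {
            IAsyncResult ar = s.BeginConnect(host, port, null, null);
            if (ar.AsyncWaitHandle.WaitOne(timeoutMs))
            {
                s.EndConnect(ar); // throws SocketException if the connect failed
                return s;
            }
            // Timed out: fall through and close, which cancels the pending connect.
        }
        catch (SocketException)
        {
            // Connect failed (for example, connection refused): fall through and close.
        }
        s.Close();
        return null;
    }

    // Retries sequentially, so no two connect operations ever overlap on one socket.
    public static Socket ConnectWithRetry(string host, int port,
                                          int timeoutMs, int delayMs, int maxAttempts)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            Socket s = TryConnectOnce(host, port, timeoutMs);
            if (s != null) return s;
            Thread.Sleep(delayMs);
        }
        return null;
    }
}
```

Called as ConnectRetrySketch.ConnectWithRetry("127.0.0.1", 6500, 5000, 1000, 10), this would make up to ten attempts with a one-second pause after each failure, and a new attempt never starts while the previous one is still pending.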

Upvotes: 1
