Simon MᶜKenzie
Simon MᶜKenzie

Reputation: 8693

ManualResetEvent.WaitOne() doesn't return if Reset() is called immediately after Set()

I have a problem in a production service which contains a "watchdog" timer used to check whether the main processing job has become frozen (this is related to a COM interop problem which unfortunately can't be reproduced in test).

Here's how it currently works:

The cause appears to be that when multiple items await processing, the time between the Set() after the first item is processed, and the Reset() before the second item is processed is too brief, and WaitOne() doesn't appear to recognise that the event has been set.

My understanding of WaitOne() is that the blocked thread is guaranteed to receive a signal when Set() is called, but I assume I'm missing something important.

Note that if I allow a context switch by calling Thread.Sleep(0) after calling Set(), WaitOne() never fails.

Included below is a sample which produces the same behaviour as my production code. WaitOne() sometimes waits for 5 seconds and fails, even though Set() is being called every 800 milliseconds.

private static ManualResetEvent _handle;

private static void Main(string[] args)
{
    _handle = new ManualResetEvent(true);

    ((Action) PeriodicWait).BeginInvoke(null, null);
    ((Action) PeriodicSignal).BeginInvoke(null, null);

    Console.ReadLine();
}

private static void PeriodicWait()
{
    Stopwatch stopwatch = new Stopwatch();

    while (true)
    {
        stopwatch.Restart();
        bool result = _handle.WaitOne(5000, false);
        stopwatch.Stop();
        Console.WriteLine("After WaitOne: {0}. Waited for {1}ms", result ? "success" : "failure",
                            stopwatch.ElapsedMilliseconds);
        SpinWait.SpinUntil(() => false, 1000);
    }
}

private static void PeriodicSignal()
{
    while (true)
    {
        _handle.Reset();
        Console.WriteLine("After Reset");
        SpinWait.SpinUntil(() => false, 800);
        _handle.Set();
        // Uncommenting either of the lines below prevents the problem
        //Console.WriteLine("After Set");
        //Thread.Sleep(0);
    }
}

output of the above code


The Question

While I understand that calling Set() closely followed by Reset() doesn't guarantee that all blocked threads will resume, is it also not guaranteed that any waiting threads will be released?

Upvotes: 7

Views: 19734

Answers (2)

Stephen Cleary
Stephen Cleary

Reputation: 457472

You can't "pulse" an OS event like this.

Among other issues, there's the fact that any OS thread performing a blocking wait on an OS handle can be temporarily interrupted by a kernel-mode APC; when the APC finishes, the thread resumes waiting. If the pulse happened during that interruption, the thread doesn't see it. This is just one example of how "pulses" can be missed (described in detail in Concurrent Programming on Windows, page 231).

BTW, this does mean that the PulseEvent Win32 API is completely broken.

In a .NET environment with managed threads, there's even more possibility of missing a pulse. Garbage collection, etc.

In your case, I would consider switching to an AutoResetEvent which is repeatedly Set by the working process and (automatically) reset by the watchdog process each time its Wait completes. And you'd probably want to "tame" the watchdog by only having it check every minute or so.

Upvotes: 8

Hans Passant
Hans Passant

Reputation: 942538

No, this is fundamentally broken code. There are only reasonable odds that the WaitOne() will complete when you keep the MRE set for such a short amount of time. Windows favors releasing a thread that's blocked on an event. But this will drastically fail when the thread isn't waiting. Or the scheduler picks another thread instead, one that runs with a higher priority and also got unblocked. Could be a kernel thread for example. MRE doesn't keep a "memory" of having been signaled and not yet waited on.

Neither Sleep(0) or Sleep(1) are good enough to guarantee that the wait is going to complete, there's no reasonable upper bound on how often the waiting thread could be bypassed by the scheduler. Although you probably ought to shut down the program when it takes longer than 10 seconds ;)

You'll need to do this differently. A simple way is to rely on the worker to eventually set the event. So reset it before you start waiting:

private static void PeriodicWait() {
    Stopwatch stopwatch = new Stopwatch();

    while (true) {
        stopwatch.Restart();
        _handle.Reset();
        bool result = _handle.WaitOne(5000);
        stopwatch.Stop();
        Console.WriteLine("After WaitOne: {0}. Waited for {1}ms", result ? "success" : "failure",
                            stopwatch.ElapsedMilliseconds);
    }
}

private static void PeriodicSignal() {
    while (true) {
        _handle.Set();
        Thread.Sleep(800);   // Simulate work
    }
}

Upvotes: 14

Related Questions