Cyan

Reputation: 13968

Suspend and Resume thread (Windows, C)

I'm currently developing a heavily multi-threaded application that processes many small data batches.

The problem is that too many threads are being spawned, which slows down the system considerably. To avoid that, I keep a table of handles which limits the number of concurrent threads. I then call "WaitForMultipleObjects", and when a slot is freed, I create a new thread with its own data batch to handle.

Now I've got exactly as many threads as I want (typically, one per core). Even then, the overhead incurred by multi-threading is very noticeable. The reason: each data batch is small, so I'm constantly creating new threads.

The first idea, which I'm currently implementing, is simply to regroup jobs into longer serial lists. When I create a new thread, it then has 128 or 512 data batches to handle before terminating. It works well, but somewhat destroys granularity.

I was asked to look at another scenario: if the problem comes from "creating" threads too often, what about "pausing" a thread, loading a new data batch and then "resuming" it?

Unfortunately, I'm not having much success. The problem is: when a thread is in "suspend" mode, "WaitForMultipleObjects" does not detect it as available. In fact, I can't efficiently distinguish between an active thread and a suspended one.

So I've got 2 questions:

  1. How do I detect a "suspended thread", so that I can load new data into it and resume it?

  2. Is it a good idea? After all, is "CreateThread" really a resource hog?

Edit

After much testing, here are my findings concerning thread pooling and IO Completion Ports, both advised in this post.

Thread pooling was tested using the older API, "QueueUserWorkItem". IO Completion Ports require CreateIoCompletionPort, GetQueuedCompletionStatus and PostQueuedCompletionStatus.

1) First, on performance: Creating many threads is very costly, and both thread pooling and IO Completion Ports do a great job of avoiding that cost. I am now down to 8 jobs per batch, from an earlier 512 jobs per batch, with no slowdown. This is considerable. Even when going down to 1 job per batch, the performance impact is less than 5%. Truly remarkable.

From a performance standpoint, QueueUserWorkItem wins, albeit by such a small margin (about 1% better) that it is almost negligible.

2) On usage simplicity: Regarding starting threads: no question, QueueUserWorkItem is by far the easiest to set up. IO Completion Port is heavyweight in comparison. Regarding ending threads: win for IO Completion Port. For some unknown reason, MS provides no C function to know when all jobs queued with QueueUserWorkItem have completed. It takes some nasty tricks to implement this basic but critical function. There is no excuse for such a missing feature.

3) On resource control: Big win for IO Completion Port, which lets you finely tune the number of concurrent threads. There is no such control with QueueUserWorkItem, which will happily spend all CPU cycles on all available cores. That, in itself, could be a deal breaker for QueueUserWorkItem. Note that the newer thread pool API seems to allow that control, but it is only available on Windows Vista and later.

4) On compatibility: Small win for IO Completion Port, which has been available since Windows NT4. QueueUserWorkItem has only existed since Windows 2000. This is, however, good enough. The newer thread pool API is a no-go for Windows XP.

As can be guessed, I'm pretty much torn between the 2 solutions. Both meet my needs. For the general case, I suggest IO Completion Port, mostly for resource control. On the other hand, QueueUserWorkItem is easier to set up. Quite a pity that it loses most of this simplicity by requiring the programmer to handle end-of-jobs detection alone.

Upvotes: 2

Views: 4205

Answers (3)

Damon

Reputation: 70206

If you want to also support Windows XP, you cannot use CreateThreadpool -- otherwise, if Vista and newer is sufficient, Windows thread pools are the easiest way.

If Windows XP support is needed, spawn a number of threads and assign them to an IO completion port, then have each thread block on GetQueuedCompletionStatus(). Completion ports let you post events to the port which will wake exactly one thread per event, and they are very efficient. They use a LIFO strategy on waking threads to keep caches warm, too.

In any case, you will never want to suspend a thread. Never ever. Block, wait, but don't suspend.

The reason is that with suspend you get the problem that you describe, plus you will create deadlocks, e.g. if your thread is within a critical section or mutex. Aside from a debugger, nobody should ever need to suspend a thread.

Upvotes: 1

Hans Passant

Reputation: 942308

Yes, there's a fair amount of overhead involved with CreateThread. One solution is to use a thread pool, QueueUserWorkItem. Another is to just start a set of threads and have them retrieve a 'job item' from a thread-safe queue.

Upvotes: 3

i_am_jorf

Reputation: 54640

Instead of implementing your own, consider using CreateThreadpool(). The OS will do the work for you, and you don't have to worry about getting it right.

Upvotes: 4
