Avega

Reputation: 186

Can std::async "reuse" threads?

As described in the title, I'd like to know whether tasks run with std::async can "reuse" idle threads.

For example, let's take the following code:

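#include <chrono>
#include <future>
#include <thread>
#include <vector>

int main()
{
    auto task = []() { std::this_thread::sleep_for(std::chrono::seconds(20)); };  // each task only sleeps
    int tasksCount = 160;
    std::vector<std::future<void>> futures;  // keep the futures alive so the tasks can overlap
    for (int i = 0; i < tasksCount; ++i)
    {
        futures.push_back(std::async(task));  // default launch policy, as in the question
    }
}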

So we have a lot of tasks (160) running in parallel that do nothing. When this code runs on Windows, it generates 161 waiting threads.

Isn't that too many threads for doing nothing? Why can't waiting threads be "reused"?

Upvotes: 0

Views: 1239

Answers (3)

Yakk - Adam Nevraumont

Reputation: 275740

A thread, roughly, is a CPU state and reserved memory space for a stack, plus an entry in an OS scheduler. The C++ language also has information about per-thread state (thread_local), and helper libraries may also have some state.

These are reasonably expensive. This information cannot be shared between threads; each thread actually has a different stack, a different set of thread_local state, different register values, etc.
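To make the per-thread state concrete, here is a minimal sketch showing that each thread gets its own copy of a thread_local variable. The names counter, bump_in_new_thread and demo_thread_local are made up for illustration and do not come from the question.

```cpp
#include <cassert>
#include <thread>

thread_local int counter = 0;  // each thread gets its own independent copy

// Runs `times` increments of `counter` on a fresh thread and returns
// the value that thread saw at the end.
int bump_in_new_thread(int times) {
    int seen = 0;
    std::thread worker([&seen, times] {
        for (int i = 0; i < times; ++i) ++counter;
        seen = counter;
    });
    worker.join();  // join before reading `seen`
    return seen;
}

// Hypothetical driver: every new thread starts from its own zero.
void demo_thread_local() {
    assert(bump_in_new_thread(3) == 3);
    assert(bump_in_new_thread(5) == 5);  // not 8: a new thread, a new copy
    assert(counter == 0);                // the calling thread's copy is untouched
}
```

The second call does not continue from 3 because the first thread's copy of counter was destroyed along with that thread.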

Now, when a thread isn't executing, it is just an entry in a table. No CPU resources (other than those caused by a larger table) are spent on the thread. So you pay a large setup cost up front: a bunch of threads are started, and then they go to sleep. The scheduler doesn't return to those threads until the sleep time they asked for has elapsed.

So at the hardware level, they are sharing CPUs. But at the software level, their state isn't shared, and that is what you are seeing in the debugger.

Upvotes: 2

MSalters

Reputation: 179991

The sharing does happen, but at core level, not thread level. Since your threads are doing virtually no computation, it's likely all 160 threads can share a single CPU core.

Fundamentally, a thread holds a call stack, with the local variables of each function invocation. This stack can't really be shared - the fundamental property of a call stack is that the top function is the one actively executing. In your example, you have 160 sleep_for on top of 160 stacks.
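A rough way to check that sleeping threads overlap rather than queue up behind each other is to time a batch of them. This is only a sketch: time_sleeping_tasks is a hypothetical helper, and std::launch::async is passed explicitly (unlike in the question) so that the tasks cannot be deferred and run one after another.

```cpp
#include <chrono>
#include <future>
#include <thread>
#include <vector>

// Launches `count` tasks that each sleep for `nap`, waits for all of them,
// and returns the total wall-clock time in milliseconds.
long long time_sleeping_tasks(int count, std::chrono::milliseconds nap) {
    const auto start = std::chrono::steady_clock::now();
    std::vector<std::future<void>> futures;
    futures.reserve(count);
    for (int i = 0; i < count; ++i) {
        // std::launch::async rules out deferred (lazy, sequential) execution.
        futures.push_back(std::async(std::launch::async,
                                     [nap] { std::this_thread::sleep_for(nap); }));
    }
    for (auto& f : futures) f.get();  // block until every task has finished
    const auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}
```

Calling time_sleeping_tasks(16, std::chrono::milliseconds(50)) should return a value much closer to 50 than to 16 * 50 = 800, because the sleeps run side by side even on a single core. The exact figure depends on the machine and the scheduler.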

Upvotes: 3

Jeffrey

Reputation: 11430

The important question is: what observable difference would it make to your program? The standard doesn't speak to what happens at a lower system level; it only talks about observable behaviour. There's no gain there: the only observable difference could be an unexpected mix-up of thread-local storage variables.

Consider the complexity:

  • Sleeping threads cost the system very little, so having more idle threads won't hurt much.
  • Busy threads can't be reused, at least not without cost.
  • If you wanted to reuse idle threads, how would you know that the reused thread would not become busy after sleeping?

So, in short, it would offer no visible benefit, could break thread local storage (depending on how it is stated in the spec), and would be a major pain to implement, all just for the sake of reducing the number of threads at a lower level.
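The thread local storage concern can be illustrated with a sketch of what reuse looks like. ReusedWorker below is a hypothetical single-thread task runner written for this example, not something std::async provides. Because every task lands on the same thread, a thread_local value left behind by one task is visible to the next.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical single-thread "pool": every submitted task runs on one worker.
class ReusedWorker {
public:
    ReusedWorker() : worker_([this] { run(); }) {}

    ~ReusedWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_one();
        worker_.join();
    }

    // Queues `fn` for the worker and returns a future for its result.
    std::future<int> submit(std::function<int()> fn) {
        std::packaged_task<int()> task(std::move(fn));
        std::future<int> result = task.get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        ready_.notify_one();
        return result;
    }

private:
    void run() {
        for (;;) {
            std::packaged_task<int()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::packaged_task<int()>> tasks_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist before it starts
};

thread_local int calls_on_this_thread = 0;

// Returns how many times this function has run on the current thread.
int count_call() { return ++calls_on_this_thread; }
```

Submitting count_call twice to the same ReusedWorker yields 1 and then 2, whereas running it on two separate std::thread objects would yield 1 both times. A task that expects its thread_local state to start fresh would see the leftover value from the previous task.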

Upvotes: 1
