std::thread runs A LOT slower than std::future

Question

I have a simple rendering program with a main loop that runs at about 8000 FPS on one thread (it does nothing except draw a background). I wanted to see whether rendering from another thread would upset the current context without changing it (to my surprise, it didn't). I did this with the following code:

m_Thread = std::thread(Mainloop);
m_Thread.join();

This somehow ran extremely slowly, at ~30 FPS. I thought this was weird, and I remembered that in another project I had used std::future for a similar performance-related reason. So I tried it with std::future using the following code:

m_Future = std::async(std::launch::async, Mainloop);
m_Future.get();

and this runs just a tiny bit below the single-threaded performance (~7900 FPS). Why is std::thread so much slower than std::future?

Edit:

Disregard the above code; here is a minimal reproducible example. Set THREAD to 1 to use std::thread or to 0 to use std::async, then compare:

#include <Windows.h>
#include <chrono>
#include <future>
#include <string>
#include <thread>

#define THREAD 1

static void Function()
{
    
}

int main()
{
    std::chrono::high_resolution_clock::time_point start = std::chrono::high_resolution_clock::now();
    std::chrono::high_resolution_clock::time_point finish = std::chrono::high_resolution_clock::now();
    long double difference = 0;
    long long unsigned int fps = 0;

#if THREAD
    std::thread worker;
#else
    std::future<void> worker;
#endif

    while (true)
    {
        //FPS 
        finish = std::chrono::high_resolution_clock::now();
        difference = std::chrono::duration_cast<std::chrono::nanoseconds>(finish - start).count();
        difference = difference / 1000000000;
        if (difference > 0.1) {
            start = std::chrono::high_resolution_clock::now();
            std::wstring fpsStr = L"Fps: ";
            fpsStr += std::to_wstring(fps);
            SetConsoleTitleW(fpsStr.c_str());
            fps = 0;
        }
        
#if THREAD
        worker = std::thread(Function);
        worker.join();
#else
        worker = std::async(std::launch::async, Function);
        worker.get();
#endif

        fps += 10;
    }

    return 0;
}

Dmitry Kuzminov · Accepted Answer

std::async can be implemented in different ways. For example, the implementation may keep a pre-allocated pool of threads, so that each call to std::async in a loop simply reuses a "hot" thread from the pool.
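To illustrate what reusing a "hot" thread means, here is a minimal sketch of a one-thread pool. The class name HotWorker and its submit() method are hypothetical and exist only for this illustration; this is not how any particular standard library implements std::async, only one way such reuse could look.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical single-thread "pool": the worker thread is created once
// and then reused for every task that is submitted.
class HotWorker {
public:
    HotWorker() : worker_([this] { run(); }) {}

    ~HotWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // Queues a task and returns a future that becomes ready when it has run.
    std::future<void> submit(std::function<void()> task) {
        // packaged_task is move-only, so it is held through a shared_ptr
        // to make the queued wrapper copyable for std::function.
        auto packaged =
            std::make_shared<std::packaged_task<void()>>(std::move(task));
        std::future<void> result = packaged->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push([packaged] { (*packaged)(); });
        }
        cv_.notify_one();
        return result;
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) {
                    return;  // stopping was requested and the queue is drained
                }
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the other members exist before run() starts
};
```

Calling submit(Function).get() in a loop on one HotWorker instance never creates a new system thread after the first; each iteration only pays for a queue push and a wake-up.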

std::thread creates a new system thread object every time one is constructed with a function to run. That can be a significant overhead compared to reusing a thread from a pool.
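A portable way to compare the two costs, without the Windows console code from the question, could look like the sketch below. The function names avg_thread_us and avg_async_us are made up for this example, and the numbers depend heavily on the platform and standard library: where std::async creates a fresh thread per call, the two results may come out nearly equal.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Empty task, mirroring Function() from the question.
static void noop() {}

// Average microseconds to create and join one std::thread.
double avg_thread_us(int iterations) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        std::thread t(noop);
        t.join();
    }
    std::chrono::duration<double, std::micro> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / iterations;
}

// Average microseconds to launch one std::async task and wait for it.
double avg_async_us(int iterations) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        std::future<void> f = std::async(std::launch::async, noop);
        f.get();
    }
    std::chrono::duration<double, std::micro> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / iterations;
}
```

Calling both functions from main with the same iteration count (say 10000) and printing the results gives a per-call cost that is easier to compare than a frame counter.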

I would advise you to test your code in a multithreaded environment, where concurrent std::async calls may start competing for the pre-allocated system objects.
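One possible way to set up such a test is sketched below: several caller threads each run the launch-and-wait loop at the same time. The function name contended_async_us and its parameters are assumptions made for this example, and the result is only a rough wall-clock average, not a rigorous benchmark.

```cpp
#include <chrono>
#include <future>
#include <thread>
#include <vector>

// Empty task, mirroring Function() from the question.
static void noop() {}

// Starts `callers` threads that each launch and wait on `iterations`
// async tasks, then returns the average wall-clock microseconds per task.
double contended_async_us(int callers, int iterations) {
    auto start = std::chrono::steady_clock::now();

    std::vector<std::thread> threads;
    for (int c = 0; c < callers; ++c) {
        threads.emplace_back([iterations] {
            for (int i = 0; i < iterations; ++i) {
                std::async(std::launch::async, noop).get();
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }

    std::chrono::duration<double, std::micro> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / (static_cast<double>(callers) * iterations);
}
```

Comparing contended_async_us(1, n) with contended_async_us(8, n) shows whether the per-task cost changes once several callers use std::async at the same time.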
