Reputation: 361
For reasons explained below, I have started to investigate how long it takes to create and run a thread. With my approach, it takes about 26 ms for 10 threads, which is much longer than it should be, at least as far as I understand.
A short background:
I'm working on a game that uses pathfinding. After adding more entities it became necessary to parallelize the process.
I want this to be as readable as possible, so I've created a ParallelTask class that holds a thread, a std::function (that should be executed by the thread), a mutex to protect some write operations, and a bool completed that is set to true once the thread has finished executing.
I'm new to multithreading, so I have no idea if this is a good approach to begin with, but nevertheless I'm confused about why it takes so long to execute.
I have written the code below to isolate the problem.
int main()
{
std::map<int, std::unique_ptr<ParallelTask>> parallelTaskDictionary;
auto start = std::chrono::system_clock::now();
for (size_t i = 0; i < 10; i++)
{
parallelTaskDictionary.emplace(i, std::make_unique<ParallelTask>());
parallelTaskDictionary[i]->Execute();
}
auto end = std::chrono::system_clock::now();
auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(end - start);
std::cout << elapsed.count() << std::endl;
parallelTaskDictionary.clear();
return 0;
}
class ParallelTask
{
public:
ParallelTask();
// Join the thread
~ParallelTask();
public:
inline std::vector<int> GetPath() const { return path; }
void Execute();
private:
std::thread thread;
mutable std::mutex mutex;
std::function<void()> threadFunction;
bool completed;
std::vector<int> path;
};
ParallelTask::ParallelTask()
{
threadFunction = [this]() {
{
std::lock_guard<std::mutex> lock(mutex);
this->completed = true;
}
};
}
ParallelTask::~ParallelTask()
{
if (thread.joinable())
thread.join();
}
void ParallelTask::Execute()
{
this->completed = false;
// Launch the thread
this->thread = std::thread(threadFunction);
}
Running this code gives me between 25 and 26 milliseconds of execution time. Since this is meant to be used in a game, that is of course unacceptable.
As previously mentioned, I do not understand why, especially since the threadFunction itself does literally nothing. In case you wonder, I have even removed the mutex lock and it gave me the same result, so there must be something else going on here. (From my research, creating a thread shouldn't take more than a couple of microseconds, but maybe I'm just wrong about that ^^)
PS: While we are at it, I still don't really understand who should own the mutex. Is there one global mutex, or one per object?
Upvotes: 0
Views: 75
Reputation: 3461
If you only want to measure the execution time of the work itself, I think you should take the start and end timestamps inside threadFunction, around the work only, as shown in the code below.
#include <map>
#include <iostream>
#include <memory>
#include <chrono>
#include <vector>
#include <thread>
#include <mutex>
#include <functional>
class ParallelTask
{
public:
ParallelTask();
// Join the thread
~ParallelTask();
public:
inline std::vector<int> GetPath() const { return path; }
void Execute();
private:
std::thread thread;
mutable std::mutex mutex;
std::function<void()> threadFunction;
bool completed;
std::vector<int> path;
};
ParallelTask::ParallelTask()
{
threadFunction = [this]() {
{
auto start = std::chrono::system_clock::now();
std::lock_guard<std::mutex> lock(mutex);
this->completed = true;
auto end = std::chrono::system_clock::now();
auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(end - start);
std::cout << "elapsed time" << elapsed.count() << std::endl;
}
};
}
ParallelTask::~ParallelTask()
{
if (thread.joinable())
thread.join();
}
void ParallelTask::Execute()
{
this->completed = false;
// Launch the thread
this->thread = std::thread(threadFunction);
}
int main()
{
std::map<int, std::unique_ptr<ParallelTask>> parallelTaskDictionary;
for (size_t i = 0; i < 10; i++)
{
parallelTaskDictionary.emplace(i, std::make_unique<ParallelTask>());
parallelTaskDictionary[i]->Execute();
}
parallelTaskDictionary.clear();
return 0;
}
which gives an output:
elapsed time1
elapsed time0
elapsed time0
elapsed time0
elapsed time0
elapsed time0elapsed time
0
elapsed time0
elapsed time0
elapsed time0
The times are this small because the measurement now excludes the time it takes to spin up the thread.
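To see the excluded part on its own, here is a minimal sketch that times only thread creation and joining, with an empty thread body. The names SpawnTiming and measureSpawnAndJoin are hypothetical and not part of the original code, and std::chrono::steady_clock is swapped in for system_clock because it is monotonic and better suited to measuring intervals. The numbers it produces depend on the OS, the build configuration, and whether a debugger is attached, so treat them as a rough indication only.

```cpp
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper type, not part of the original code.
struct SpawnTiming
{
    std::chrono::microseconds spawn; // time spent constructing the threads
    std::chrono::microseconds join;  // time spent waiting for them to finish
};

// Hypothetical helper: starts threadCount empty threads and times both phases.
SpawnTiming measureSpawnAndJoin(std::size_t threadCount)
{
    using Clock = std::chrono::steady_clock; // monotonic clock, suited to intervals
    using std::chrono::duration_cast;
    using std::chrono::microseconds;

    std::vector<std::thread> threads;
    threads.reserve(threadCount);

    const auto start = Clock::now();
    for (std::size_t i = 0; i < threadCount; ++i)
        threads.emplace_back([] {}); // empty body, so only thread overhead is measured
    const auto spawned = Clock::now();

    for (auto& t : threads)
        t.join();
    const auto joined = Clock::now();

    return { duration_cast<microseconds>(spawned - start),
             duration_cast<microseconds>(joined - spawned) };
}
```

Calling measureSpawnAndJoin(10) from main and printing spawn.count() and join.count() gives the overhead of the threads themselves, which can then be compared with the per-task times printed above.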
As a sanity check, if you want to see the effect of real work, you could add
using namespace std::chrono_literals;
std::this_thread::sleep_for(2s);
to your threadFunction, so that it looks like this:
ParallelTask::ParallelTask()
{
threadFunction = [this]() {
{
auto start = std::chrono::system_clock::now();
std::lock_guard<std::mutex> lock(mutex);
this->completed = true;
using namespace std::chrono_literals;
std::this_thread::sleep_for(2s);
auto end = std::chrono::system_clock::now();
auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(end - start);
std::cout << "elapsed time" << elapsed.count() << std::endl;
}
};
}
and the output will be:
elapsed time2000061
elapsed timeelapsed time2000103
elapsed timeelapsed time20000222000061
elapsed time2000050
2000072
elapsed time2000061
elapsed time200012
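The run-together lines in the output above (for example "elapsed timeelapsed time2000103") appear because several threads write to std::cout at the same time and each << is a separate operation. A minimal sketch of one way to keep each line intact, assuming a single mutex shared by all tasks for the one shared resource: the names outputMutex, formatElapsed and reportElapsed are hypothetical and not part of the original code.

```cpp
#include <chrono>
#include <iostream>
#include <mutex>
#include <sstream>
#include <string>

// Hypothetical names: outputMutex, formatElapsed and reportElapsed
// are illustrations only and do not appear in the original code.
std::mutex outputMutex; // one mutex for the one shared resource (std::cout)

// Builds the whole line up front so it can be written in a single operation.
std::string formatElapsed(std::chrono::microseconds elapsed)
{
    std::ostringstream line;
    line << "elapsed time " << elapsed.count() << " us\n";
    return line.str();
}

// Writes one complete line; the lock keeps other threads from interleaving.
void reportElapsed(std::chrono::microseconds elapsed)
{
    const std::string line = formatElapsed(elapsed); // format outside the lock
    std::lock_guard<std::mutex> lock(outputMutex);   // hold the lock only while writing
    std::cout << line;
}
```

Inside threadFunction, the std::cout statement could then be replaced by a call such as reportElapsed(elapsed). The per-object mutex in ParallelTask still only guards that object's own members, while the shared one guards the stream that every task uses.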
Upvotes: 1