Reputation: 1594
I'm trying to learn how threads and mutexes work, but I'm running into a bit of confusion right now. I took the following code from the official SFML 1.6 tutorials:
#include <SFML/System.hpp>
#include <cstdlib>   // for EXIT_SUCCESS
#include <iostream>

void ThreadFunction(void* UserData)
{
    // Print something...
    for (int i = 0; i < 10; ++i)
        std::cout << "I'm the thread number 1" << std::endl;
}

int main()
{
    // Create a thread with our function
    sf::Thread Thread(&ThreadFunction);

    // Start it !
    Thread.Launch();

    // Print something...
    for (int i = 0; i < 10; ++i)
        std::cout << "I'm the main thread" << std::endl;

    return EXIT_SUCCESS;
}
And the tutorial said:
So the text from both threads will be displayed at the same time.
However, that's not happening: it runs the first thread to completion and only then the second. Aren't they supposed to run at the same time? I'm using the Code::Blocks IDE on Windows XP SP3, running SFML 1.6. Am I doing something wrong, or have I misunderstood how threads work? From my point of view, threads are supposed to execute at the same time, so the output should be something like
"text from thread 1 text from thread 2 text from thread 1 and so on"
Upvotes: 1
Views: 1004
Reputation: 67723
... aren't they supposed to run at the same time?
Well, that depends.
If you have two or more cores, the two threads may genuinely run in parallel.
Even if you have the available hardware, it's up to your OS to decide how to schedule your threads: if you want to encourage your OS to interleave both threads (you can't force it without more work), try adding sleep, nanosleep or yield calls to your loops (the exact primitives will depend on your platform).
If it helps you build an intuition about how and why a kernel will make scheduling decisions, note that most CPU architectures will keep a significant amount of state (branch prediction tables, data and instruction caches) that is really good at optimizing a single thread of execution.
Therefore, it's generally more efficient to let a given thread run on a given core for as long as possible, to minimize the number of avoidable context switches, cache misses and mis-predictions.
Now, timeslicing is often used as a sort of tradeoff between the best throughput for each individual process, and the best latency or responsiveness to external events. A thread may block (by waiting for an external event such as user input or device I/O, because it explicitly synchronizes with another thread, or explicitly sleeps or yields), in which case another thread will be scheduled while the first can't make progress, but otherwise it will typically run until the kernel pre-empts it at the end of its allotted time slice.
When the parent thread creates a child thread, I wouldn't like to guess which is "hotter" on the current core, so letting the parent finish its timeslice (unless it blocks) is a reasonable default.
The child thread is probably runnable right away, but if it doesn't pre-empt the parent thread, it isn't obvious why it should immediately pre-empt a thread on a different core either. After all, it's still in the same process as the parent thread, and shares the same memory, address maps and other resources: unless another core is completely idle, the best place to schedule the child is probably on the same core as its parent, because there's a decent chance the parent kept those shared resources warm in the cache there.
So, the reason your threads don't get interleaved is likely that neither runs for an appreciable fraction of a timeslice before the process exits, and neither does any blocking I/O or explicitly yields (stdout isn't blocking for that amount of data, as it'll easily be buffered).
Upvotes: 4