Reputation: 1021
I wrote a String sorting program in C++ and my question is which is basically read some string chunks from a txt file and put into a vector and sort the strings. First I measured the sort execution time using a single threaded program. Then I divide the vector into two protions and sorted them using two threads. But the problem is that multi threaded program took more time to execute than the single threaded program. Can someone explain me what is the reason?.. Thanks.
By the way the string vector contains 20 characters long 1 million string records and I have not added the file read function.
Single Thread Program..
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
#include <algorithm>
#include <intrin.h>
#pragma intrinsic(__rdtsc)
using namespace std;
vector<string> ReadFile();
int main()
{
vector<string> RandomStringVector;
unsigned __int64 t1,t2;
RandomStringVector = ReadFile();
t1 = __rdtsc();
sort(RandomStringVector.begin(),RandomStringVector.end());
t2 = __rdtsc();
printf_s("%I64d ticks\n", (t2 - t1)/1000000);
system("pause");
return 0;
}
This is the Multi Threaded Program
#include <process.h>
#include <windows.h>
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
#include <algorithm>
#include <intrin.h>
#pragma intrinsic(__rdtsc)
using namespace std;
void SortString(void * arg);
vector<string> ReadFile();
int main(){
vector<string> FullStringVector;
FullStringVector = ReadFile();
vector<string> v1(FullStringVector.begin(), FullStringVector.begin() + FullStringVector.size()/2),
v2(FullStringVector.begin() + FullStringVector.size()/2 +1, FullStringVector.end());
_beginthread( SortString, 0, (void *)&v1);
_beginthread( SortString, 0, (void *)&v2);
system("pause");
return 0;
}
void SortString(void *arg)
{
unsigned __int64 t1,t2;
vector<string> * StringVector;
vector<string> SortedStringVector;
StringVector = (vector<string> *)arg;
SortedStringVector = *StringVector;
t1 = __rdtsc();
sort(SortedStringVector.begin(),SortedStringVector.end());
t2 = __rdtsc();
printf_s("%I64d ticks\n", (t2 - t1)/1000000);
}
Upvotes: 2
Views: 2145
Reputation: 15768
First of all, you are using processor-clock ticks to measure the performance, and by that measure any multi-threaded algorithm will be slower than an equivalent single-threaded algorithm. The reason is because this measure effectively counts the number of instructions that gets executed and threading always adds a little overhead to an algorithm.
To get a proper measurement of performance, you need to measure the wall-clock time. This way, the measurement can accurately reflect the work that gets done in parallel by different cores/processors.
Also, when adapting an algorithm for parallel execution, you need to make sure that the algorithm remains consistent (in your question, the single-threaded and multi-threaded sorts are not identical. The multi-threaded version needs an additional pass to sort the two halves together) and that there is not too much communication overhead between the threads (not really an issue here, but consider an algorithm where thread X needs the result from thread Y before it can calculate its own result).
Upvotes: 3