Reputation: 255
My code for measuring the clock cycles needed to create a thread is:
# include <Windows.h>
# include <stdio.h>
# include <conio.h>
# define RUN 20000
DWORD WINAPI Calc(LPVOID Param){ return 0; } /* thread procedure must return an exit code */
int main(int argc, char* argv[])
{
ULONG64 Sum = 0;
for (int i = 0; i < RUN; i++)
{
ULONG64 ret = 0;
DWORD ThreadId;
HANDLE ThreadHandle;
/* create the thread */
ThreadHandle = CreateThread(
NULL, /* default security attributes */
0, /* default stack size */
Calc, /* thread function */
NULL, /* parameter to thread function */
0, /* default creation flags */
&ThreadId); /* returns the thread identifier */
QueryThreadCycleTime(ThreadHandle, &ret);
WaitForSingleObject(ThreadHandle, INFINITE);
CloseHandle(ThreadHandle);
Sum += ret;
}
printf_s("The Average no of cycles in %d runs is %lu\n", RUN, (DWORD)(Sum/RUN));
_getch();
return 0;
}
The result for this code is roughly 1000 clock cycles on my modest laptop. But if I call the QueryThreadCycleTime function after the WaitForSingleObject function, the result is very different, on the order of 200,000. I looked around a lot but didn't find an explanation. What is the reason for this behavior?
Upvotes: 0
Views: 359
Reputation: 613382
The difference is whether or not you wait for the thread to complete execution. If you wait, the thread has run to completion by the time you query it, so it has consumed more clock cycles.
Note that you are not timing the process of creating the thread. You are timing execution of the thread procedure. Let me be clear, the value returned by QueryThreadCycleTime is the number of cycles consumed executing the thread, and not the number of cycles spent executing CreateThread, or indeed the elapsed wall clock time from calling CreateThread to the thread starting execution.
And you are doing so at an indeterminate point. For instance, in your code, QueryThreadCycleTime sometimes returns 0 because the thread has not even started executing at the point where the main thread calls QueryThreadCycleTime.
If you want to time thread creation then time how long it takes for CreateThread to return. Or even better, measure the wall clock time that elapses between the call to CreateThread being made, and the thread starting execution.
For instance, the code might look like this:
# include <Windows.h>
# include <stdio.h>
# include <conio.h>
# define RUN 100000
DWORD WINAPI Calc(LPVOID Param){
/* record the counter value at the moment the thread starts executing */
QueryPerformanceCounter((LARGE_INTEGER*)Param);
return 0;
}
int main(int argc, char* argv[])
{
ULONG64 Sum = 0;
for (int i = 0; i < RUN; i++)
{
LARGE_INTEGER PerformanceCountBeforeCreateThread, PerformanceCountWhenThreadStartsExecuting;
DWORD ThreadId;
HANDLE ThreadHandle;
/* create the thread */
QueryPerformanceCounter(&PerformanceCountBeforeCreateThread);
ThreadHandle = CreateThread(NULL, 0, Calc,
&PerformanceCountWhenThreadStartsExecuting, 0, &ThreadId);
WaitForSingleObject(ThreadHandle, INFINITE);
CloseHandle(ThreadHandle);
Sum += PerformanceCountWhenThreadStartsExecuting.QuadPart - PerformanceCountBeforeCreateThread.QuadPart;
}
printf_s("The Average no of counts in %d runs is %lu\n", RUN, (DWORD)(Sum/RUN));
LARGE_INTEGER Frequency;
QueryPerformanceFrequency(&Frequency);
printf_s("Frequency %lld\n", Frequency.QuadPart); /* QuadPart is a 64-bit LONGLONG */
_getch();
return 0;
}
Upvotes: 2