Reputation: 3391
I was having some fun in C language with time.h library, trying to measure the number of clock ticks of some basic functions, just to figure out how fast they actually are. I used the clock() function. In this case I was measuring the printf() function.
Look at my program:
#include <stdio.h>
#include <time.h>
void main()
{
const int LIMIT = 2000;
const int LOOP = 20;
int results[LOOP];
for(int i=0; i<LOOP; i++)
{
int j;
clock_t time01 = clock();
for(j=1; j<LIMIT; j++)
{
printf("a");
}
clock_t time02 = clock();
results[i] = (int) (time02 - time01);
}
for(int i=0; i<LOOP; i++)
{
printf("\nCLOCK TIME: %d.", results[i]);
}
getchar();
}
The program just basically counts 20 times the number of clock ticks of 2000 times called printf("a") function.
The strange thing I don't understand is the result. I get most of the time, even when doing other tests, randomly two groups of results:
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.
I don't understand how exactly compiler handles that function. There is some test for % character I guess, but that wouldn't make that difference. Looks more like compiler is doing something in the memory... (?) Does anyone know the precise background of compiling this code or why there appears that difference mentioned above? Or at least some link that would help me?
Thank you.
Upvotes: 2
Views: 3420
Reputation: 5637
man page of clock
said that it returns an
approximation of processor time used by the program
This approxmation is based on the famous Time Stamp Counter. As wikipedia says :
It counts the number of cycles since reset
Sadly, nowadays, this counter can vary between core.
There is no promise that the timestamp counters of multiple CPUs on a single motherboard will be synchronized.
So beware to lock your code on a certain cpu, otherwise, you will continue to have strange results. And since you seems to search precise results, you can use this code instead of clock
call :
uint64_t rdtsc(void) {
uint32_t lo, hi;
__asm__ __volatile__ ( // serialize
"xorl %%eax,%%eax \n cpuid"
::: "%rax", "%rbx", "%rcx", "%rdx");
/* We cannot use "=A", since this would use %rax on x86_64 and return only the lower 32bits of the TSC */
__asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
return (uint64_t)hi << 32 | lo;
}
Upvotes: -1
Reputation: 1
Some compilers (in particular recent versions of gcc
on recent Linux distributions, when optimizing with -O2
) are able to optimize printf("a")
into code very similar to putchar(
a)
But most of the time is spent in the kernel doing the write
system call.
Upvotes: 1
Reputation: 272467
I can think of at least two possible causes:
printf
will occasionally be flushing its buffer.Upvotes: 2