Jeyekomon
Jeyekomon

Reputation: 3391

Speed of printf()

I was having some fun in C language with time.h library, trying to measure the number of clock ticks of some basic functions, just to figure out how fast they actually are. I used the clock() function. In this case I was measuring the printf() function.

Look at my program:

#include <stdio.h>
#include <time.h>

void main()
{
    const int LIMIT = 2000;
    const int LOOP = 20;
    int results[LOOP];

    for(int i=0; i<LOOP; i++)
    {
        int j;
        clock_t time01 = clock();

        for(j=1; j<LIMIT; j++)
        {
            printf("a");
        }

        clock_t time02 = clock();
        results[i] = (int) (time02 - time01);
    }

    for(int i=0; i<LOOP; i++)
    {
        printf("\nCLOCK TIME: %d.", results[i]);        
    }
    getchar();
}

The program just basically counts 20 times the number of clock ticks of 2000 times called printf("a") function.

The strange thing I don't understand is the result. I get most of the time, even when doing other tests, randomly two groups of results:

CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.

I don't understand how exactly compiler handles that function. There is some test for % character I guess, but that wouldn't make that difference. Looks more like compiler is doing something in the memory... (?) Does anyone know the precise background of compiling this code or why there appears that difference mentioned above? Or at least some link that would help me?

Thank you.

Upvotes: 2

Views: 3420

Answers (3)

Coren
Coren

Reputation: 5637

man page of clock said that it returns an

approximation of processor time used by the program

This approxmation is based on the famous Time Stamp Counter. As wikipedia says :

It counts the number of cycles since reset

Sadly, nowadays, this counter can vary between core.

There is no promise that the timestamp counters of multiple CPUs on a single motherboard will be synchronized.

So beware to lock your code on a certain cpu, otherwise, you will continue to have strange results. And since you seems to search precise results, you can use this code instead of clock call :

  uint64_t rdtsc(void) {
    uint32_t lo, hi;
    __asm__ __volatile__ (      // serialize
    "xorl %%eax,%%eax \n        cpuid"
    ::: "%rax", "%rbx", "%rcx", "%rdx");
    /* We cannot use "=A", since this would use %rax on x86_64 and return only the lower 32bits of the TSC */
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return (uint64_t)hi << 32 | lo;
  }

Upvotes: -1

Some compilers (in particular recent versions of gcc on recent Linux distributions, when optimizing with -O2) are able to optimize printf("a") into code very similar to putchar(a)

But most of the time is spent in the kernel doing the write system call.

Upvotes: 1

Oliver Charlesworth
Oliver Charlesworth

Reputation: 272467

I can think of at least two possible causes:

  1. Your clock has limited resolution.
  2. printf will occasionally be flushing its buffer.

Upvotes: 2

Related Questions