Usnish Roy
Usnish Roy

Reputation: 9

program which will find out the performance/property of other programs.

I need to write a program which will find out the performance/property of other programs.

For example, lets say, I want to know what are the functions are getting called when i give top command in linux. I need to know the which function in top command is called first and which function is responsible for printing output.

Or it can be thought in another way, I want to know what are the functions are getting called in a program, in which order,and how much time it spend for its execution.

Upvotes: 0

Views: 52

Answers (2)

Ragnar
Ragnar

Reputation: 1182

If you're just interested in system calls you can use a trace tool like strace:

strace -tt -o strace.log -C top

Will run "top" and log system calls including timings to the file "strace.log", including a summary at the end showing how much time was spent doing various calls. But this will of course not capture what happens in the application code itself - for that you need real profiling like described in the previous response. for a quick-and-dirty view of what happens in a running binary, though, a trace program can be a good tool.

Upvotes: 0

5gon12eder
5gon12eder

Reputation: 25459

There are two popular approaches to accumulate profiling information about a program. You can either use a dedicated profiler or manually instrument your code with calls to clock. In both cases you will have to re-compile your code but only in the second, you will have to modify your source code.

Using Tool Support

If you are using the GNU toolchain, you are looking for Gprof. It comes together with GCC and does just what you've asked for. It works in three steps:

  1. You compile your program with the special option -pg. This will cause GCC to instrument your binary with special instructions that accumulate the profiling statistics.
  2. You run your program on a representative input set. At program exit, profiling statistics will be written to special files for later analysis.
  3. You run (in the terminal) gprof main > profile.txt where you substitute your executable name for main. This will output human-readable profiling data to the file profile.txt.

The file contains two sections: the “flat profile” and the “call graph”. Quoting the manual:

The flat profile shows the total amount of time your program spent executing each function. […]

This is part of a flat profile for a small program:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 33.34      0.02     0.02     7208     0.00     0.00  open
 16.67      0.03     0.01      244     0.04     0.12  offtime
 16.67      0.04     0.01        8     1.25     1.25  memccpy
 16.67      0.05     0.01        7     1.43     1.43  write
 16.67      0.06     0.01                             mcount
  0.00      0.06     0.00      236     0.00     0.00  tzset
  0.00      0.06     0.00      192     0.00     0.00  tolower
  0.00      0.06     0.00       47     0.00     0.00  strlen
  0.00      0.06     0.00       45     0.00     0.00  strchr
  0.00      0.06     0.00        1     0.00    50.00  main
  0.00      0.06     0.00        1     0.00     0.00  memcpy
  0.00      0.06     0.00        1     0.00    10.11  print
  0.00      0.06     0.00        1     0.00     0.00  profil
  0.00      0.06     0.00        1     0.00    50.00  report
...

The functions are sorted first by decreasing run-time spent in them, then by decreasing number of calls, then alphabetically by name. The functions mcount and profil are part of the profiling apparatus and appear in every flat profile; their time gives a measure of the amount of overhead due to profiling.

And quoting the manual again:

The call graph shows how much time was spent in each function and its children. From this information, you can find functions that, while they themselves may not have used much time, called other functions that did use unusual amounts of time.

Here is a sample call from a small program. This call came from the same gprof run as the flat profile example in the previous section.

granularity: each sample hit covers 2 byte(s) for 20.00% of 0.05 seconds

index % time    self  children    called     name
                                                 <spontaneous>
[1]    100.0    0.00    0.05                 start [1]
                0.00    0.05       1/1           main [2]
                0.00    0.00       1/2           on_exit [28]
                0.00    0.00       1/1           exit [59]
-----------------------------------------------
                0.00    0.05       1/1           start [1]
[2]    100.0    0.00    0.05       1         main [2]
                0.00    0.05       1/1           report [3]
-----------------------------------------------
                0.00    0.05       1/1           main [2]
[3]    100.0    0.00    0.05       1         report [3]
                0.00    0.03       8/8           timelocal [6]
                0.00    0.01       1/1           print [9]
                0.00    0.01       9/9           fgets [12]
                0.00    0.00      12/34          strncmp <cycle 1> [40]
                0.00    0.00       8/8           lookup [20]
                0.00    0.00       1/1           fopen [21]
                0.00    0.00       8/8           chewtime [24]
                0.00    0.00       8/16          skipspace [44]
-----------------------------------------------
[4]     59.8    0.01        0.02       8+472     <cycle 2 as a whole> [4]
                0.01        0.02     244+260         offtime <cycle 2> [7]
                0.00        0.00     236+1           tzset <cycle 2> [26]
-----------------------------------------------

The lines full of dashes divide this table into entries, one for each function. Each entry has one or more lines.

In each entry, the primary line is the one that starts with an index number in square brackets. The end of this line says which function the entry is for. The preceding lines in the entry describe the callers of this function and the following lines describe its subroutines (also called children when we speak of the call graph).

The entries are sorted by time spent in the function and its subroutines.

I hope this could give you an idea and recommend you read through the manual yourself to find out how to use Gprof effectively. A final great thing about Gprof is that you can feed the statistics it gathered back into GCC to let it optimize your code better. This is known as profile guided optimization (PGO).

Doing it by Hand

If you insist in not using tools like Gprof, you can of course do things by hand but I recommend you stay away from this if all possible. Why would you re-invent a flat tire if you can have a wheel for free? Of course, there are tradeoffs for everything and in some situations, you might only want to profile a few functions and the manual approach might be warranted or even superior. (Often times, it won't.)

Anyway, you could decide what functions are interesting to you. For example, let's assume we have the functions foo , bar and baz that we want to profile. You start by numbering them, preferably in an enumeration.

enum ProfiledFunctions {
  PROF_FOO,
  PROF_BAR,
  PROF_BAZ,
  PROF_LENGTH,  // dummy, counts the number of functions
};

Then you create a global array to record the time spent in each function.

#include <time.h>

static clock_t profile_data[PROF_LENGTH];

And instrument all your functions.

int
foo(int a, int b)
{
#if COLLECT_PROFILING_DATA
  clock_t t_begin, t_end;
  t_begin = clock();
#endif
  // Do useful stuff with 'a' and 'b'...
#if COLLECT_PROFILING_DATA
  t_end = clock();
  profile_data[PROF_FOO] += t_end - t_begin;
#endif
  return 0;
}

Of course, you do the same to bar and baz as well. You might want to use some preprocessor magic to reduce the amount of typing redundant boilerplate code.

I have surrounded the profiling logic with conditionals so we can compile them in and out as we see fit (using -DCOLLECT_PROFILING_DATA=1 on the compiler command line). The last thing you want to do is going over your code more than once adding / removing this boilerplate logic. (The second last thing you want to do is doing this even once which lets me refer back to tools like Gprof.)

Finally, you go and change your main.

int
main()
{
#if COLLECT_PROFILING_DATA
  clock_t t_begin, t_end;
  t_begin = clock();
#endif
  // Do what your program is supposed to do...
#if COLLECT_PROFILING_DATA
  t_end = clock();
  fprintf(stderr, "foo:  %ju\n", (uintmax_t) profile_data[PROF_FOO]);
  fprintf(stderr, "bar:  %ju\n", (uintmax_t) profile_data[PROF_BAR]);
  fprintf(stderr, "baz:  %ju\n", (uintmax_t) profile_data[PROF_BAZ]);
  fprintf(stderr, "main: %ju\n", (uintmax_t) (t_end - t_begin));
#endif
  return 0;
}

I'm also printing the time spent in main – that is, the program as a whole – so you can tell whether the function you have chosen to profile actually cover most of the execution time. In this timing, I have excluded the profiling overhead for printing the statistics themselves as it would be misleading.

The statistics gathered by this approach will be cumulative. If foo calls bar (and nobody else calls bar) then the time reported for foo will include all time spent in bar plus whatever else foo did. If a function calls itself recursively, you will get false data, leave alone with functions that call longjmp or similar. If your functions do this kind of thing, you really want to use a dedicated profiler tool.

Upvotes: 1

Related Questions